applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. progress in the development of...

9
Applica’on of transcriptomics to plant breeding Advances in genetics and molecular breeding of three legume crops of semi-arid tropics using next-generation sequencing and high-throughput genotyping technologies RAJEEV KVARSHNEY 1,2, * , HIMABINDU KUDAPA 1 , MANISH ROORKIWAL 1 , MAHENDAR THUDI 1 , MANISH KPANDEY 1 , RACHIT KSAXENA 1 , SIVA KCHAMARTHI 1 , MURALI MOHAN S 1 , NALINI MALLIKARJUNA 1 , HARI UPADHYAYA 1 , POORAN MGAUR 1 , LKRISHNAMURTHY 1 , KB SAXENA 1 , SHYAM NNIGAM 1 and SURESH PANDE 1 1 International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502 324, India 2 CGIAR Generation Challenge Programme (GCP), c/o CIMMYT, 06600, Mexico, DF, Mexico *Corresponding author (Fax, 0091-40-30713074/75; Email, [email protected])

Upload: others

Post on 01-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  

Advances in genetics and molecular breeding of three legume cropsof semi-arid tropics using next-generation sequencing

and high-throughput genotyping technologies

RAJEEV K VARSHNEY1,2,*,HIMABINDU KUDAPA

1,MANISH ROORKIWAL

1,MAHENDAR THUDI

1,

MANISH K PANDEY1 ,RACHIT K SAXENA1 ,SIVA K CHAMARTHI1,MURALI MOHAN S1 ,

NALINI MALLIKARJUNA1,HARI UPADHYAYA

1,POORAN M GAUR

1,L KRISHNAMURTHY

1,KB SAXENA1 ,

SHYAM N NIGAM1 and SURESH PANDE1

1International Crops Research Institute for the Semi-Arid Tropics (ICRISAT),Patancheru 502 324, India

2CGIAR Generation Challenge Programme (GCP), c/o CIMMYT, 06600, Mexico,DF, Mexico

*Corresponding author (Fax, 0091-40-30713074/75; Email, [email protected])

Molecular markers are the most powerful genomic tools to increase the efficiency and precision of breeding practicesfor crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-aridtropics (SAT), namely, chickpea (Cicer arietinum), pigeonpea (Cajanus cajan) and groundnut (Arachis hypogaea), ascompared to other crop species like cereals, has been very slow. With the advances in next-generation sequencing(NGS) and high-throughput (HTP) genotyping methods, there is a shift in development of genomic resourcesincluding molecular markers in these crops. For instance, 2,000 to 3,000 novel simple sequence repeats (SSR)markers have been developed each for chickpea, pigeonpea and groundnut. Based on Sanger, 454/FLX andIllumina transcript reads, transcriptome assemblies have been developed for chickpea (44,845 transcriptassembly contigs, or TACs) and pigeonpea (21,434 TACs). Illumina sequencing of some parental genotypesof mapping populations has resulted in the development of 120 million reads for chickpea and 128.9 millionreads for pigeonpea. Alignment of these Illumina reads with respective transcriptome assemblies haveprovided >10,000 SNPs each in chickpea and pigeonpea. A variety of SNP genotyping platforms includingGoldenGate, VeraCode and Competitive Allele Specific PCR (KASPar) assays have been developed inchickpea and pigeonpea. By using above resources, the first-generation or comprehensive genetic maps havebeen developed in the three legume species mentioned above. Analysis of phenotyping data together with genotyping datahas provided candidate markers for drought-tolerance-related root traits in chickpea, resistance to foliar diseases ingroundnut and sterility mosaic disease (SMD) and fertility restoration in pigeonpea. Together with these trait-associated markers along with those already available, molecular breeding programmes have been initiated forenhancing drought tolerance, resistance to fusarium wilt and ascochyta blight in chickpea and resistance tofoliar diseases in groundnut. These trait-associated robust markers along with other genomic resources includinggenetic maps and genomic resources will certainly accelerate crop improvement programmes in the SAT legumes.

http://www.ias.ac.in/jbiosci J. Biosci. 37(5), November 2012, 811–820, * Indian Academy of Sciences 811

Keywords. Chickpea; genomic resource; genotyping; groundnut; legume; pigeonpea; sequencing

Abbreviations used: AB, ascochyta blight; AFLP, amplified fragment length polymorphism; BAC, bacterial artificial chromosome; BES,BAC- end derived sequence; CAPS, cleaved amplified polymorphic sequences; CISR, conserved intron spanning region; COS, conservedorthologous sequence; DArT, diversity array technology; FW, Fusarium wilt; GMM, genic molecular marker; GS, Genomic selection; HTP, high-throughput; LLS, late leaf spot; MABC, marker-assisted backcrossing; MAS, marker-assisted selection; NGS, next-generation sequencing; RAPD,random amplified polymorphic DNA; RFLP, restriction fragment length polymorphism; RIL, recombinant inbred line; SAT, semi-arid tropics;SCMR, SPAD chlorophyll meter readings; SLA, specific leaf area; SMD, sterility mosaic disease; SNP, single nucleotide polymorphism; SPAD,soil plant analytical development; SSR, simple sequence repeats; TAC, transcript assembly contig; TE, transpiration efficiency; TUS, tentativeunique sequences

Published online: 15 October 2012

Page 2: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  

Legumes  include  a  number  of  important  crops,  namely,  soybean  (Glycine  max),  groundnut  (Arachis   hypogaea),   cowpea   (Vigna   unguiculata),   common   bean   (Phaseolus   vulgaris),  chickpea   (Cicer   arie9num),   pigeonpea   (Cajanus   cajan),   len'l   (Lens   culinaris),   pea   (Pisum  sa9vum),  mungbean  (Vigna  radiata),  etc.    

The  global  produc'on  of  grain  and  forage  legumes  is  about  300  million  metric  tons  which  are  grown  on  about  190  million  hectares.  Legumes  are  a  rich  source  of  proteins,  vitamins,  minerals  and  dietary  fibre  

Page 3: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  Several   food   legume   crops   are   grown   in   semi-­‐arid   tropics   (SAT)   of   Africa,   Asia   and   South  America.  In  these  areas,  the  legume  crops  are  exposed  to  various  bio'c  and  abio'c  stresses.  Crop   produc)vity   of   these   legume   crops   can   be   enhanced   through   the   use   of  biotechnological  tools  in  the  breeding  programmes.  

Marker   trait   associa'ons   are   a   prerequisite   for   marker-­‐assisted   selec'on   (MAS).   Marker  trait  associa'on  for  a  number  of  traits  in  all  major  crops  have  now  become  available  due  to  the   accessibility   of   an   array   of   molecular   markers   and   dense   molecular   gene'c   maps.  However,  majority  of  the  legume  crops  remained  untouched  with  genomics  revolu)on    

Page 4: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  

Two   NGS   technologies   namely   454   and   Illumina   together   with   Sanger   sequencing  technology   have   been   used   to   characterize   the   transcriptomes   of   chickpea   and  pigeonpea.   20.162   and   9.888   ESTs   were   developed   for   chickpea   and   pigeonpea   on  drought-­‐  and  salinity-­‐challenged  cDNA   libraries   for  chickpea  and  Fusarium  wilt   (FW)  and  sterility  mosaic  disease  (SMD)-­‐challenged  cDNA  libraries  for  pigeonpea.    

In  order  to  improve  these  transcriptomic  resources,  454/FLX  sequencing  was  undertaken  on  normalized  and  pooled  RNA  samples  collected  from  >20  'ssues  represen'ng  different  developmental  stages  of  the  plant.  As  a  result,  435.018  transcript  reads  for  chickpea  and  494.353   transcript   reads   for   pigeonpea   have   been   generated.  Cluster   analysis   of   these  transcript  reads  provided  transcript  assembly  of  chickpea  with  103.215  tenta)ve  unique  sequences  (TUSs)  and  pigeonpea  with  127.754  TUSs    

Page 5: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  Illumina   sequencing   was   carried   out   on   parental   genotypes   of   mapping   popula'ons   of  chickpea  and  pigeonpea.  Alignment  of  these  short  reads  onto  TAs  of  respec)ve  species  has  provided  a  large  number  (tens  of  thousands)  of  SNPs  in  each  of  these  species.  This  allowed  the  development  of  cost-­‐effec)ve  genotyping  plaQorms.  

The  applica)on  of  these  informa)on  in  plant  breeding  programs  let  to  the  mapping  of  QTLs  involved   in   drought   response   in   chickpea   and   in   response   to   Sterility  Mosaic   Disease   in  pigeonpea  

Page 6: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  

RESEARCH ARTICLE Open Access

De novo assembly and Characterisation of theTranscriptome during seed development, andgeneration of genic-SSR markers in Peanut(Arachis hypogaea L.)Jianan Zhang1,2, Shan Liang3, Jialei Duan4, Jin Wang1, Silong Chen1, Zengshu Cheng1, Qiang Zhang1,Xuanqiang Liang5 and Yurong Li1*

Abstract

Background: The peanut (Arachis hypogaea L.) is an important oilseed crop in tropical and subtropical regions ofthe world. However, little about the molecular biology of the peanut is currently known. Recently, next-generationsequencing technology, termed RNA-seq, has provided a powerful approach for analysing the transcriptome, andfor shedding light on the molecular biology of peanut.

Results: In this study, we employed RNA-seq to analyse the transcriptomes of the immature seeds of threedifferent peanut varieties with different oil contents. A total of 26.1-27.2 million paired-end reads with lengths of100 bp were generated from the three varieties and 59,077 unigenes were assembled with N50 of 823 bp. Basedon sequence similarity search with known proteins, a total of 40,100 genes were identified. Among these unigenes,only 8,252 unigenes were annotated with 42 gene ontology (GO) functional categories. And 18,028 unigenesmapped to 125 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes pathway database(KEGG). In addition, 3,919 microsatellite markers were developed in the unigene library, and 160 PCR primers of SSRloci were used for validation of the amplification and the polymorphism.

Conclusion: We completed a successful global analysis of the peanut transcriptome using RNA-seq, a largenumber of unigenes were assembled, and almost four thousand SSR primers were developed. These data willfacilitate gene discovery and functional genomic studies of the peanut plant. In addition, this study providesinsight into the complex transcriptome of the peanut and established a biotechnological platform for futureresearch.

BackgroundThe peanut (Arachis hypogaea L.), also known as thegroundnut, is an important oilseed crop in the tropicaland subtropical regions of the world. It is grown on sixcontinents but mainly in Asia, Africa and America. Pea-nuts are cultivated on 23.51 million hectares worldwide,with a total global production of approximately 35.52million tons (the weight includes the shell). China is thelargest producer in the world, accounting for 37.6%

(13.34 million tons) of the total world production (FAO,2009, http://faostat.fao.org).Peanuts have a desirable fatty acid profile and are rich

in vitamins, minerals and bioactive materials, includingseveral known heart-healthy nutrients, such as monoun-saturated and polyunsaturated fatty acids, potassium,magnesium, copper niacin, arginine, fibre, a-tocopherol,folates, phytosterols, and flavonoids. Indeed, peanut con-sumption has been associated with an improvement inthe overall quality of the diet and nutrient [1-4].In China, almost 60% of the peanuts are used to pro-

duce peanut oil [5]. Peanut oil, due to its high monounsa-turated fat content, is considered healthier than saturatedoils and is resistant to rancidity. Monounsaturated fat,

* Correspondence: [email protected] of Food and Oil Crops, Hebei Academy of Agriculture and ForestrySciences/Laboratory of Crop Genetics and Breeding of Hebei Province,Shijiazhuang 050031, ChinaFull list of author information is available at the end of the article

Zhang et al. BMC Genomics 2012, 13:90http://www.biomedcentral.com/1471-2164/13/90

© 2012 Zhang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative CommonsAttribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction inany medium, provided the original work is properly cited.

Page 7: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  The  peanut   (Arachis  hypogaea   L.),  also  known  as   the  groundnut,   is  an   important  oilseed  crop  in  the  tropical  and  subtropical  regions  of  the  world.  It  is  grown  on  six  con'nents  but  mainly   in   Asia,   Africa   and   America.   Peanuts   are   cul)vated   on   23.51   million   hectares  worldwide,  with  a  total  global  produc)on  of  approximately  35.52  million  tons  

Peanut   consump'on  has  been  associated  with  an   improvement   in   the  overall   quality  of  the  diet  and  nutrient    

The   development   of   the   peanut   seed   has   been   studied   intensely   to   understand   the  physiological,  biochemical,  and  molecular  characteris'cs  that  determine  the  oil  quality  and  their  beneficial  nutri'onal  contribu'ons.  However,  the  development  of  the  peanut  seed  is   a   complex   process   involving   a   cascade   of   biochemical   changes,   which   involve   the  transcrip)onal  modula)on  of  many  genes,  yet  liXle  is  known  about  these  transcrip)onal  changes  and  their  regula)on.    

Page 8: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  

We  generated  27.2,  26.9  and  26.1  million  100-­‐bp  paired-­‐  end  reads   for  the  JH4,  K01  and  T21  varie'es,  encompassing  2.44,  2.42  and  2.35  Gb  of  sequence  data,  respec'vely  

Assembling  these  reads  produced  44.028,  47.110  and  44.157  unigenes  for  the  JH4,  K01  and  T21  varie'es,  respec'vely.  AYer  the  final  clustering,  59.077  unigenes  were  obtained.  

The  unigenes  were  compared  against  the  NCBI  NR  protein  database  using  blastx.  Among  the  59.077  unigenes,  40.100   (67.88%)  had  at   least  one  significant  match  with  an  E-­‐value  below  1e-­‐5.  A  total  of  18.977  unigenes  had  no  significant  matches  to  any  known  protein,  the   result   that   may   be   partly   due   to   novel   genes   or   highly   divergent   genes,   or   these  unigenes  could  represent  untranslated  regions.    

Page 9: Applicaon*of*transcriptomics*to*plantbreeding*for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics(SAT),namely,

Applica'on  of  transcriptomics  to  plant  breeding  

Microsatellite  markers  (SSR  markers)  are  some  of  the  most  successful  molecular  markers  in  the  construc'on  of  a  peanut  gene'c  map  and  in  diversity  analysis.  In  this  study,  5.883  microsatellites   were   detected   in   4.993   unigenes,   of   which,   728   sequences   contained  more  than  1  SSR.  

Based  on  the  5,883  SSRs,  3,919  primer  pairs  were  successfully  designed  

A  total  of  160  primer  pairs  were  randomly  selected  to  validate  these  polymorphisms  in  six  varie'es.   All   160   of   the   markers   yielded   amplifica)on   products,   and   65   (40.63%)  exhibited  polymorphisms  among  the  six  varie)es.