


BLAST
Sequences queried against the nr or grass databases.

GO ANALYSIS
Contigs classified based on homology to known plant or fungal genes.
[Figure: panels 1 to 5]

Next Generation Sequencing On Species Important to the Turfgrass Industry
Keenan Amundsen [1,2], Geunhwa Jung [3], Dilip Lakshman [2], Scott Warnke [2]

[1] George Mason University, [2] United States Department of Agriculture, [3] University of Massachusetts

SEQUENCE ASSEMBLY
36-mer sequences were assembled into contigs using the software programs Edena and Velvet.
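To illustrate what assembling short reads into contigs without a reference means, here is a toy greedy overlap merger. It is only a sketch of the general idea and is not the algorithm used by Edena or Velvet. The function names, the 12-base reads, and the minimum overlap of 6 are assumptions chosen for the example. The test sequence is the A. canina 36-mer shown in the SNP section of this poster.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len):
    """Repeatedly merge the pair of sequences with the longest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Toy input: overlapping 12-base windows cut from the A. canina 36-mer.
source = "CGCCGCCATGCCTTACACGGGGATTACATGAGAAGA"
reads = [source[start:start + 12] for start in range(0, 25, 6)]
contigs = greedy_assemble(reads, min_len=6)
print(contigs)
```

Real assemblers must also handle sequencing errors, repeats, and reverse-complement reads, none of which this sketch attempts.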

SAMPLE PREPARATION
Preparation of samples for sequencing (approximately 3 days).
[Figure labels: Total RNA; Isolate mRNA; cDNA Synthesis; Ligate Adapters; Cluster Station]

DNA Sequencing
Illumina sequencing run and data analysis (approximately 5 days).
[Figure labels: Illumina GAII; Normalize Intensities; DNA Basecalls]

Species                       36-mers   Edena Contigs   Velvet Contigs   Unique Contigs
Agrostis stolonifera 1        401,308               1               24               25
Agrostis stolonifera 2        598,095             133              335              362
Agrostis stolonifera 3              -               -                -                -
Agrostis canina                26,880               0                0                0
Sclerotinia homoeocarpa 1     881,450             430            1,456            1,413
Sclerotinia homoeocarpa 2     953,211             427            1,582            1,534
Rhizoctonia solani            326,174              62              258              258
Total                       3,187,118           1,053            3,655            3,592
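As a consistency check, the per-library counts in the assembly table can be summed by genus and compared with the totals quoted in the abstract. The sketch below hard-codes the table values. The dictionary layout and function name are illustrative, not part of the original analysis.

```python
# Counts copied from the assembly table:
# (36-mers, Edena contigs, Velvet contigs, unique contigs).
# Agrostis stolonifera 3 is left out because the table reports no values for it.
LIBRARIES = {
    "Agrostis stolonifera 1": (401_308, 1, 24, 25),
    "Agrostis stolonifera 2": (598_095, 133, 335, 362),
    "Agrostis canina": (26_880, 0, 0, 0),
    "Sclerotinia homoeocarpa 1": (881_450, 430, 1_456, 1_413),
    "Sclerotinia homoeocarpa 2": (953_211, 427, 1_582, 1_534),
    "Rhizoctonia solani": (326_174, 62, 258, 258),
}


def totals_by_genus(libraries):
    """Sum each count column over all libraries that share a genus name."""
    totals = {}
    for name, counts in libraries.items():
        genus = name.split()[0]
        running = totals.get(genus, (0, 0, 0, 0))
        totals[genus] = tuple(a + b for a, b in zip(running, counts))
    return totals


genus_totals = totals_by_genus(LIBRARIES)
for genus, (reads, edena, velvet, unique) in genus_totals.items():
    print(f"{genus}: {reads:,} 36-mers, {unique:,} unique contigs")

grand_total_reads = sum(t[0] for t in genus_totals.values())
print(f"Total: {grand_total_reads:,} 36-mers")
```

The per-genus sums agree with the abstract: 1,026,283 Agrostis reads and 387 turfgrass contigs, 1,834,661 Sclerotinia reads and 2,947 contigs, and 326,174 Rhizoctonia reads and 258 contigs.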

   Species             Sequences   Hits   Unique   Agrostis   Grass   Unknown
1  A. stolonifera 1           25     19        6         22       2         1
2  A. stolonifera 2          362    308       54        309      39        14
3  S. homoeocarpa 1        1,413    626      787
4  S. homoeocarpa 2        1,534    740      794         ND
5  R. solani                 258     83      175
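The BLAST table is easier to compare across libraries when hits are expressed as a fraction of the contigs queried. The following sketch computes that fraction from the table values. The names are illustrative, and it assumes that the Unique column counts the sequences without a hit, which matches the arithmetic in every row.

```python
# (sequences queried, sequences with a BLAST hit, sequences without a hit),
# copied from the BLAST table.
BLAST_RESULTS = {
    "A. stolonifera 1": (25, 19, 6),
    "A. stolonifera 2": (362, 308, 54),
    "S. homoeocarpa 1": (1_413, 626, 787),
    "S. homoeocarpa 2": (1_534, 740, 794),
    "R. solani": (258, 83, 175),
}


def hit_fraction(sequences, hits):
    """Fraction of queried sequences that returned at least one hit."""
    if sequences <= 0:
        raise ValueError("sequences must be positive")
    return hits / sequences


for library, (sequences, hits, unique) in BLAST_RESULTS.items():
    # Hits and unique sequences should account for every queried sequence.
    assert hits + unique == sequences
    print(f"{library}: {hits}/{sequences} = {hit_fraction(sequences, hits):.1%}")
```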

GENOTYPES
Total RNA was extracted from 4 Agrostis, 2 Sclerotinia, and 1 Rhizoctonia samples.

3 Agrostis stolonifera
1 Agrostis canina
2 Sclerotinia homoeocarpa (causes dollar spot)
1 Rhizoctonia solani (causes brown patch)

INSTRUMENTATION
Sequencing reactions were performed on the Illumina platform.
[Figure labels: Cluster station; Illumina GAII]

PROBLEM
There is limited publicly available DNA sequence data for several species important to the turfgrass industry.

•What is the practicality of implementing next generation sequencing technologies to expand the collection of sequence data for these species?

•Can we use next generation sequencing for sequence assembly and comparative genomics without a reference sequence?

•Can we develop novel genetic markers based on next generation sequence data?

•Can we use next generation sequencing for expression profiling?

ABSTRACT
The bentgrasses (Agrostis spp.) are important species to the turfgrass industry because their unique growth and aesthetic characteristics make them ideally suited for use on golf course tees, fairways, and putting greens. Molecular marker development is difficult in Agrostis due to the limited amount of available DNA sequence data. For example, there are 16,992 Agrostis EST sequences (as of April 23, 2009) published in the National Center for Biotechnology Information databases, dwarfed by the millions of EST sequences available for cereal grasses. Illumina sequencing technology is among the most popular of the next generation sequencing technologies and provides an affordable way of producing large amounts of sequence data. The objective of this study was to evaluate the Illumina Genome Analyzer for the generation of EST sequence data from one velvet bentgrass (A. canina L.), three creeping bentgrasses (A. stolonifera L.), two dollar spot-causing fungal isolates (Sclerotinia homoeocarpa F.T. Bennett), and one brown patch-causing fungal isolate (Rhizoctonia solani Kühn). In addition, the feasibility of assembling the short Illumina sequencing reads into usable data was tested. A total of 1,026,283 (Agrostis), 1,834,661 (Sclerotinia), and 326,174 (Rhizoctonia) 36-mer sequences were generated. The software programs Edena and Velvet were used to assemble the sequences into contigs. A total of 387 contigs were assembled for the turfgrass libraries, 2,947 for the Sclerotinia libraries, and 258 for the Rhizoctonia library. Although this sequencing run generated only approximately 10 percent of the expected amount of data, more than 3,000 EST sequences were recovered. This preliminary experiment demonstrates the utility of high throughput sequencing on species important to the turfgrass industry and the need for additional sequencing.

CONCLUSIONS
Next generation sequencing is a viable sequencing technology for studying species important to the turfgrass industry. In this study, we were able to assemble 36-mers into contigs without a reference sequence, characterize the sequences, develop SNP markers in Agrostis, and compare the relative expression of conserved ESTs between two S. homoeocarpa isolates. It is estimated that a successful sequencing run would provide 10-fold more 36-mers and yield significantly more assembled contigs.

ACKNOWLEDGEMENTS
Thanks are given to the United States Golf Association and to the United States Department of Agriculture for supporting this research.

FUTURE DIRECTION

•Expand DNA sequence knowledgebase of important turf and fungal species.

•Profile differentially expressed genes in response to disease and abiotic stress.

•Develop genetic markers for marker assisted selection and linkage map development.

•Monitor expression of siRNAs and their influence on gene network regulation.

•Explore species relationships through comparative genomics and phylogenetic studies.

•Address concerns for data management.

SNP Design
Identified single nucleotide polymorphisms to be used as genetic markers.

Identified 56 SNPs by comparing the A. canina 36-mer sequence data to published A. capillaris and A. stolonifera EST sequences.

Used the published full length Agrostis EST sequences to design molecular beacons for SNP detection.

A.canina      CGCCGCCATGCCTTACACGGGGATTACATGAGAAGA
A.stolonifera CGCCGCCATGCCTTACACGGGGATTACATGAGAAGA
A.capillaris  CGCCGCCATGCCTTACACGGGGATTACATGGGAAGA
              ****************************** *****

A.capillaris  ATCGCCGCCATGCCTTACACGGGGATTACATGGGAAGACTTAGAGCGAGAGGCCGCCGGACTCCTCGTCCTCG
              ||||||||||||||||||||||||||||||||.||||||||||||||||||||||||||||||||||||||||
A.stolonifera ATCGCCGCCATGCCTTACACGGGGATTACATGAGAAGACTTAGAGCGAGAGGCCGCCGGACTCCTCGTCCTCG
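The candidate SNP in the three-way alignment above can be located programmatically by scanning the aligned sequences column by column. This is a minimal sketch with illustrative names. The poster does not say which software was used to identify the 56 SNPs, so this should not be read as the authors' method.

```python
# Aligned 36-mers copied from the alignment shown on the poster.
ALIGNMENT = {
    "A.canina": "CGCCGCCATGCCTTACACGGGGATTACATGAGAAGA",
    "A.stolonifera": "CGCCGCCATGCCTTACACGGGGATTACATGAGAAGA",
    "A.capillaris": "CGCCGCCATGCCTTACACGGGGATTACATGGGAAGA",
}


def find_mismatches(aligned):
    """Return (1-based column, {name: base}) for every column where the bases differ."""
    if len({len(seq) for seq in aligned.values()}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    names = list(aligned)
    mismatches = []
    for index, column in enumerate(zip(*aligned.values())):
        if len(set(column)) > 1:
            mismatches.append((index + 1, dict(zip(names, column))))
    return mismatches


snp_candidates = find_mismatches(ALIGNMENT)
for position, bases in snp_candidates:
    print(position, bases)
```

The single mismatching column corresponds to the gap in the row of asterisks under the alignment.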

EXPRESSION ANALYSIS
Relative expression of 232 conserved S. homoeocarpa ESTs, expressed as the number of 36-mer reads/kb.
[Figure: paired bars for S. homoeocarpa 1 and S. homoeocarpa 2]

Each vertical line represents one conserved EST. Sequences expressed equally in each library should have bars of equal length for S. homoeocarpa 1 and S. homoeocarpa 2.
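The reads/kb measure divides the number of 36-mer reads assigned to an EST by the EST length in kilobases, so long and short ESTs can be compared on the same scale. The sketch below shows the calculation. The EST names, lengths, and read counts are hypothetical values invented for the example and are not data from the poster.

```python
def reads_per_kb(read_count, est_length_bp):
    """Number of reads per kilobase of EST sequence."""
    if est_length_bp <= 0:
        raise ValueError("EST length must be positive")
    return read_count * 1000 / est_length_bp


# Hypothetical ESTs: (length in bp, reads in library 1, reads in library 2).
example_ests = {
    "EST_A": (800, 40, 44),
    "EST_B": (1200, 30, 90),
}

for name, (length_bp, reads_lib1, reads_lib2) in example_ests.items():
    lib1 = reads_per_kb(reads_lib1, length_bp)
    lib2 = reads_per_kb(reads_lib2, length_bp)
    print(f"{name}: library 1 = {lib1:.1f} reads/kb, library 2 = {lib2:.1f} reads/kb")
```

This sketch does not scale for the different total read counts of the two libraries. The poster does not state whether such a correction was applied.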