CLC bio presentation at 5th SFAF, 6/3/2010

Post on 11-May-2015

DESCRIPTION

My presentation at the 5th Sequencing, Finishing and Analysis in the Future meeting (SFAF -- http://www.lanl.gov/conferences/finishfuture/2010SFAF_Meeting_Guide.pdf), June 3, 2010

TRANSCRIPT

CLC bio: A Comprehensive Platform for NGS Data Analysis

Saul A. Kravitz, PhD

Director of Consulting Services

Before the Flood

2005: $5M Human genome – 19 sequencer years

Sample Prep, Sequencing, Analysis, Science

Nextgen Sequencing Revolution

Sample Prep, Sequencing, Analysis, Science

2010: $6k Human genome ~1 sequencer day

Help!!

Bioinformatics Challenges

•Data Analysis Tools for Biomedical Researchers

•GUI-driven

•HPC integration

•Unprecedented data volumes

•Rapid technology change, applications growth

•Multi-platform data integration

•No one-size-fits-all solutions

•Rapid customization and adaptation

CLC bio NGS Analysis Platform

•CLC Genomics Workbench: Easy to use, wizard-driven desktop software

•CLC Genomics Server: Enterprise solution

•CLC Assembly Cell: High performance NGS algorithms

•Developer SDK: Workbench and Server customization

Swiss Army Knife of NGS Analysis

Genomics, Transcriptomics, Epigenomics

RNA-Seq

miRNA

ChIP-Seq

Read Mapping

De Novo Assembly

SNP/DIP Detection

Visualization

File Format Conversion

Desktop Solutions

Enterprise Solutions

Traditional Bioinformatics

Intuitive GUI

SDK

Tools Integration

High Performance

Why not use free tools?

•Are tools free or “free”?

•Tools vs solutions

•True cost of ownership

•Ease of Use

•Tools integration

•Support

Small RNA Analysis (in Beta soon)

•Identify and filter/trim adapters

•Annotate using miRBase and other resources

- target species of interest

•Merge/group by mature, precursor/reference

•Fully integrated with expression analysis
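The adapter trimming step in the first bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not CLC bio's algorithm: it uses exact matching only, and the function name `trim_adapter`, the `min_overlap` parameter, and the adapter sequence in the usage note are hypothetical.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 5) -> str:
    """Remove a 3' adapter from a read using exact matching (illustrative only)."""
    # Case 1: the full adapter occurs inside the read; keep what precedes it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a partial adapter (a prefix of the adapter).
    # Try the longest possible prefix first, down to min_overlap bases.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # No adapter evidence found; return the read unchanged.
    return read
```

For example, with the made-up adapter `TGGAATTCTC`, calling `trim_adapter(insert + "TGGAAT", "TGGAATTCTC")` returns `insert` when the insert itself does not end in adapter-like bases. A production trimmer would also tolerate sequencing errors in the adapter, which this sketch does not.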

De Novo Assembler

• Human assembly of 38x Illumina paired-end

• CLC quality equivalent to ABySS

• CLC: 7 hrs, 1 node, 42 GB of RAM

• ABySS: 80 hrs, 21 nodes, 336 GB of RAM

• Metagenomics Assembly

• MetaHIT dataset MH0041: 40M 75 bp paired-end reads

• 3 hrs on desktop, 6 GB RAM

• Higher N50 and Total Contig Size than Reported
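For readers unfamiliar with the two metrics in the last bullet, the following sketch shows how N50 and total contig size are commonly computed from a list of contig lengths. The function names are illustrative and the numbers in the usage note are made up; they are not the MetaHIT results.

```python
def total_contig_size(contig_lengths):
    """Sum of all contig lengths in the assembly."""
    return sum(contig_lengths)


def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all assembled bases."""
    total = total_contig_size(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly
```

With lengths `[2, 3, 4, 5, 6, 7, 8, 9, 10]` the total is 54 and the three longest contigs (10 + 9 + 8 = 27) reach half of it, so N50 is 8.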

Viral Sequencing at JCVI (See Nadia Fedorova’s Poster!)

• Amplify and Barcode using SISPA, 454 + Illumina Sequencing

• Depth of coverage sometimes >1000x

• De novo Assembly of Consensus for all Segments

• For each segment:

• Map reads from each technology independently using best full length reference from NCBI, call variations

• Update reference with variations confirmed by multiple technologies

• Map reads using updated reference and all reads

• Convert to consed, analyze, order Sanger closure reactions

Source: Jessica Hostetler, Nadia Fedorova, Tim Stockwell, Danny Katzel
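The "update reference with variations confirmed by multiple technologies" step can be sketched as follows. This is a simplified illustration, not the JCVI pipeline: it assumes variant calls are single-base substitutions stored as `{0-based position: alternate base}` per technology, and it treats a variation as confirmed only when every technology reports the same base at the same position. The function name `update_reference` and the data layout are hypothetical.

```python
def update_reference(reference, calls_by_technology):
    """Apply substitutions that every sequencing technology agrees on.

    reference: reference sequence for one segment (str)
    calls_by_technology: {technology name: {0-based position: alternate base}}
    Returns the updated sequence and the sorted list of confirmed calls.
    """
    call_sets = [set(calls.items()) for calls in calls_by_technology.values()]
    # Require at least two technologies; a single one cannot confirm itself.
    confirmed = set.intersection(*call_sets) if len(call_sets) >= 2 else set()
    updated = list(reference)
    for pos, alt in confirmed:
        updated[pos] = alt
    return "".join(updated), sorted(confirmed)
```

Reads from all technologies would then be remapped against the returned sequence. Insertions and deletions shift coordinates and need additional handling that this sketch omits.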

Why CLC bio Tools?

• CLC handled hybrid sequencing technologies directly

• Very biased coverage confounded other assemblers that expect random arrival stats. CLC didn’t seem to suffer from biased coverage.

• Very accurate SNP calls in areas of deep coverage.

Tim Stockwell, Director of Viral Informatics, J. Craig Venter Institute

Targeted Resequencing QC

•Assessment of targeted sequencing technology

•Coverage Statistics for Targeted Regions

•Very short schedule, limited bioinformatics staff

•Plug-in development leveraging CLC tools to automate the process and meet short deadline

•QC Report now available as plug-in
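As an illustration of the coverage statistics mentioned above, the sketch below computes mean depth and the fraction of bases at or above a depth threshold for each targeted region. It is not the CLC plug-in: the function name `target_coverage_stats`, the `min_depth` default of 20, and the input layout (a per-base depth list plus half-open `(start, end)` intervals) are assumptions made for the example.

```python
def target_coverage_stats(depth, targets, min_depth=20):
    """Per-target coverage summary.

    depth: per-base read depth along the reference (list of ints)
    targets: list of (start, end) intervals, 0-based, end exclusive
    min_depth: depth a base needs to count as adequately covered (assumed value)
    """
    stats = []
    for start, end in targets:
        region = depth[start:end]
        n = len(region)
        covered = sum(1 for d in region if d >= min_depth)
        stats.append({
            "start": start,
            "end": end,
            "mean_depth": sum(region) / n if n else 0.0,
            "fraction_covered": covered / n if n else 0.0,
        })
    return stats
```

A QC report could then flag any target whose `fraction_covered` falls below a chosen cutoff.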

Professional Services

•Developing customized solutions

•Integration with LIMS, workflows, DB

•Bioinformatics Algorithm Development

•Cloud and Grid Integration

•Data Analysis

Thank you for listening

Saul A. Kravitz, PhD

skravitz@clcbio.com

(301) 355-0813

Questions?
