Tutorial 6 High Throughput Sequencing

Upload: eleanore-richardson

Post on 18-Jan-2016

TRANSCRIPT

Page 1: Tutorial 6 High Throughput Sequencing. HTS tools and analysis Review of resequencing pipeline Visualization - IGV Analysis platform – Galaxy Tuning up

Tutorial 6

High Throughput Sequencing

HTS tools and analysis

• Review of resequencing pipeline

• Visualization - IGV

• Analysis platform – Galaxy

• Tuning up the pipelines

Review of resequencing pipeline

Demultiplexing

Lane

Unknown inserts

Reference Genome

Sample

Mapping

Demultiplexing

Example of mapping parameters:

• Number of mismatches per read

• Scores for mismatch or gaps

Mapping parameters affect the rest of the analysis
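
As a rough illustration of the two example parameters on this slide, the sketch below counts mismatches and computes a score for one read that has already been aligned. The function name and the default scores (match +1, mismatch -3, gap -5) are assumptions chosen for the example; they are not values from the tutorial or from any particular mapper.

```python
def score_alignment(read_aln, ref_aln, match=1, mismatch=-3, gap=-5):
    """Return (mismatch_count, score) for two aligned strings of equal length.

    A '-' in either string marks a gap. The scores are illustrative defaults.
    """
    if len(read_aln) != len(ref_aln):
        raise ValueError("aligned strings must have the same length")
    mismatches = 0
    score = 0
    for read_base, ref_base in zip(read_aln, ref_aln):
        if read_base == "-" or ref_base == "-":
            score += gap          # gap in the read or in the reference
        elif read_base == ref_base:
            score += match
        else:
            mismatches += 1
            score += mismatch
    return mismatches, score
```

A mapper that allows only a given number of mismatches per read, or that requires a minimum score, would accept or reject the read based on these two numbers, which is why the choice propagates to every later step.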

Removing duplicates and non-unique mappings

Mapping

Demultiplexing

Reference Genome

Reference Genome

?
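
The filtering step named on this slide could be sketched as follows. This is a simplified illustration, not the method of any specific tool: the `Read` record and its `n_hits` field are hypothetical, duplicates are defined here only as reads sharing chromosome, start position and strand, and the first read seen is the one kept.

```python
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    pos: int
    strand: str
    n_hits: int  # hypothetical: number of equally good mapping locations


def remove_duplicates_and_multimappers(reads):
    """Keep uniquely mapped reads, one per (chrom, pos, strand)."""
    seen = set()
    kept = []
    for read in reads:
        if read.n_hits != 1:
            continue              # non-unique mapping: placement is ambiguous
        key = (read.chrom, read.pos, read.strand)
        if key in seen:
            continue              # same coordinates already seen: likely duplicate
        seen.add(key)
        kept.append(read)
    return kept
```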

Resequencing/ Exome Pipeline

Coverage profile and variant calling

Removing duplicates and non-unique mappings

Mapping

Demultiplexing

Reference Genome

…ACTTCGTCGAAAGG…

G

Coverage profile and variant calling

Removing duplicates and non-unique mappings

Mapping

Demultiplexing

Variant filtering

Reference Genome

…ACTTCGTCGAAAGG…

Reference Genome

…ACTTCGTCGAAAGG…

Frequency >= 20%

Coverage >= 5
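
The two thresholds on this slide can be expressed as a small filter. The function and parameter names are assumptions made for the sketch; only the cut-offs (frequency of at least 20%, coverage of at least 5) come from the slide.

```python
def passes_variant_filter(alt_count, coverage, min_frequency=0.20, min_coverage=5):
    """Return True if a candidate variant meets both thresholds.

    alt_count: number of reads supporting the variant base
    coverage:  total number of reads covering the position
    """
    if coverage < min_coverage:
        return False              # too few reads to judge the position
    return alt_count / coverage >= min_frequency
```

For example, 3 supporting reads out of 10 pass, while 4 out of 4 are rejected because the position is covered by fewer than 5 reads.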

Variant calling

Removing duplicates and non-unique mappings

Mapping

Demultiplexing

Variant filtering

Genes and known variants

Reference Genome

…ACTTCGTCGAAATG… …GTCCCGTGATACTCCGT…

GA

rs230985

Gene X
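
Annotating a variant with genes and known variants amounts to two lookups, sketched below. The data structures and the function name are assumptions for illustration; the labels `Gene X` and `rs230985` come from the slide, but any coordinates passed in are up to the caller and none are given in the tutorial.

```python
def annotate_variant(chrom, pos, genes, known_variants):
    """Attach overlapping gene names and a known-variant ID to one position.

    genes:          list of (chrom, start, end, name), inclusive coordinates
    known_variants: dict mapping (chrom, pos) to an identifier
    """
    overlapping = [
        name
        for gene_chrom, start, end, name in genes
        if gene_chrom == chrom and start <= pos <= end
    ]
    return {
        "genes": overlapping,
        "known_id": known_variants.get((chrom, pos)),  # None if not known
    }
```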

Resequencing results

Working with IGV

http://www.broadinstitute.org/igv/

Integrative Genomics Viewer

IGV is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

Genome used for mapping

Name of sample (BAM file)

Lowest resolution of the genome (zoomed out)

Zooming in

Zooming in

Coverage track

Alignment track

Zoomed in until we get to the base pair value

SNP

Hover over the coverage track in order to see details regarding all bases in a specific position

Can we trust this SNP?
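
The per-position details shown when hovering over the coverage track are essentially a tally of the bases observed at that position. A minimal sketch of such a tally, with a hypothetical function name and rounded percentages:

```python
from collections import Counter


def base_summary(bases):
    """Map each observed base to (count, percent of reads), percent rounded."""
    counts = Counter(bases)
    total = sum(counts.values())
    return {base: (n, round(100 * n / total)) for base, n in counts.items()}
```

A position where the non-reference base appears in only a small share of a handful of reads is a weaker SNP candidate than one supported by many reads.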

Hover over the alignment track in order to see details regarding a specific read

What is the quality of this read and its mapping?

Right-click on alignment track to change view of this track

Color reads by strand to verify there is no strand bias
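
The visual check for strand bias can also be approximated numerically. The sketch below flags a variant when almost all supporting reads come from one strand; the function names and the 10% cut-off are assumptions for illustration, not a recommendation from the tutorial.

```python
def forward_fraction(strands):
    """Fraction of reads on the forward strand; strands holds '+' and '-'."""
    if not strands:
        raise ValueError("no reads supplied")
    return strands.count("+") / len(strands)


def looks_strand_biased(strands, min_minor_fraction=0.1):
    """True if the less common strand contributes under min_minor_fraction."""
    forward = forward_fraction(strands)
    return min(forward, 1 - forward) < min_minor_fraction
```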

Why and how to work with IGV

Base qualities, comparison between samples

False positive indels

Same mapping statistics – different meaning

What might cause this low percentage of mapping?

The sample contains a high percentage of contamination

The sample is very different from the reference genome

One image is worth a thousand words…

Structural Variations

Large deletion in the sample compared to the reference genome
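
In a coverage profile, a large deletion shows up as a stretch of positions with no reads. One simple way to look for such stretches is sketched below; the function name and the minimum length are assumptions, and a real analysis would also need to rule out regions that are simply hard to sequence or map.

```python
def zero_coverage_runs(coverage, min_length=3):
    """Return (start, end) index pairs, end exclusive, of zero-coverage runs."""
    runs = []
    start = None
    for i, depth in enumerate(coverage):
        if depth == 0:
            if start is None:
                start = i                     # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(coverage) - start >= min_length:
        runs.append((start, len(coverage)))
    return runs
```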

Galaxy

Use your account name and password to log in to Galaxy:

Uploading data to Galaxy

Use the “eye” icon to view the contents of a file

Mapping, filtering and conversion to BAM

Mapping

Filter SAM file
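
Outside Galaxy, the same kind of filter can be sketched directly on SAM text. The example below relies on the SAM column layout (FLAG in column 2, MAPQ in column 5) and on the standard flag bits for unmapped, secondary and duplicate reads. The function name and the MAPQ cut-off of 20 are assumptions for illustration; the tutorial does not state which filter settings it uses.

```python
UNMAPPED = 0x4      # read did not map
SECONDARY = 0x100   # secondary (not primary) alignment
DUPLICATE = 0x400   # marked as PCR or optical duplicate


def filter_sam_lines(lines, min_mapq=20):
    """Yield header lines plus alignments that pass the flag and MAPQ checks."""
    for line in lines:
        if line.startswith("@"):
            yield line                        # keep header lines unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & (UNMAPPED | SECONDARY | DUPLICATE):
            continue
        if mapq < min_mapq:
            continue
        yield line
```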

Convert SAM to BAM

Variant calling

Create pileup

Find variants
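
Conceptually, the two steps "Create pileup" and "Find variants" stack the aligned bases at every reference position and then report the positions where a non-reference base appears. The sketch below is a toy version under strong assumptions: reads are ungapped, coordinates are 0-based, every read lies fully within the reference, and no quality or frequency filtering is applied. It is not the algorithm used by the Galaxy tools.

```python
from collections import Counter, defaultdict


def build_pileup(reads):
    """reads: iterable of (start, sequence). Returns {position: Counter of bases}."""
    pileup = defaultdict(Counter)
    for start, sequence in reads:
        for offset, base in enumerate(sequence):
            pileup[start + offset][base] += 1
    return pileup


def find_variants(pileup, reference):
    """Return (position, ref_base, alt_base, alt_count, depth) for each mismatch."""
    variants = []
    for pos in sorted(pileup):
        ref_base = reference[pos]
        depth = sum(pileup[pos].values())
        for base, count in sorted(pileup[pos].items()):
            if base != ref_base:
                variants.append((pos, ref_base, base, count, depth))
    return variants
```

The reported count and depth are exactly the inputs that a later filtering step on frequency and coverage would use.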

Tuning up the pipelines

1 mismatch per read

5 mismatches per read

How can mapping parameters affect the results?
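
The effect of the mismatch limit can be seen with a toy mapper that slides a read along the reference and accepts the best position only if it stays within the allowed number of mismatches. This is a brute-force sketch for illustration, with a hypothetical function name; real mappers use indexed search and also handle gaps.

```python
def map_read(read, reference, max_mismatches):
    """Return (position, mismatches) of the best ungapped hit, or None."""
    best_pos = None
    best_mismatches = None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if best_mismatches is None or mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    if best_mismatches is not None and best_mismatches <= max_mismatches:
        return best_pos, best_mismatches
    return None                               # read is reported as unmapped
```

A read that differs from the reference at two bases is discarded when one mismatch is allowed and placed when five are allowed. The looser setting recovers divergent reads, but it also lets reads align to positions where they do not belong.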

False positives vs. false negatives

3-bases insertion

One pipeline for all projects?

How can you tune your analysis?

Try different programs.

Mapping:
– Change mapping parameters
– Use non-unique mappings
– Don’t filter duplicates

Variants:
– Change variant filtration
– Change variant merging: penetrance, different heredity, low coverage in one individual…
– Look for bigger variants: big insertions/deletions, inversions, copy number variations etc.