Tutorial 6
High Throughput Sequencing
HTS tools and analysis
• Visualization – IGV
• Analysis platform – Galaxy
• Tuning up the pipelines
http://www.broadinstitute.org/igv/
Why and how to work with IGV
Base qualities, comparison between samples
Same mapping statistics – different meaning
What might cause this low percentage of mapping?
The sample contains a high percentage of contamination
The sample is very different from the reference genome
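The mapping percentage itself is simple to compute, which is part of why it cannot tell the two causes above apart. A minimal sketch, assuming SAM-formatted text input: the function name `mapping_rate` and the example reads are made up for illustration, while the flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification.

```python
def mapping_rate(sam_lines):
    """Return the percentage of primary reads that are mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.rstrip("\n").split("\t")[1])
        # Ignore secondary (0x100) and supplementary (0x800) records
        # so that each read is counted only once.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        # Bit 0x4 set means the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    return 100.0 * mapped / total if total else 0.0


# Hypothetical example records (not real data).
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr1\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(mapping_rate(example))  # 50.0
```

The same 50% could come from contamination or from a divergent sample; looking at the reads themselves (for example in IGV) is what distinguishes the two.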
One image is worth a thousand words…
Structural Variations
Large deletion in the sample compared to the reference genome
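In a coverage track, a large deletion of this kind shows up as a stretch with no reads. A minimal sketch of finding such stretches, assuming a per-base depth list is already available; the function name `zero_coverage_runs`, the `min_length` parameter, and the depth values are hypothetical. A coverage gap alone is only a hint, since regions that are hard to map can also have zero coverage.

```python
def zero_coverage_runs(depth, min_length=3):
    """Return (start, end) half-open, 0-based intervals where depth is 0
    for at least min_length consecutive positions."""
    runs = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0:
            if start is None:
                start = pos  # a new zero-coverage run begins here
        elif start is not None:
            if pos - start >= min_length:
                runs.append((start, pos))
            start = None
    # Close a run that extends to the end of the region.
    if start is not None and len(depth) - start >= min_length:
        runs.append((start, len(depth)))
    return runs


# Hypothetical per-base depth values (not real data).
depth = [12, 10, 11, 0, 0, 0, 0, 9, 13, 0, 8]
print(zero_coverage_runs(depth))  # [(3, 7)]
```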
https://main.g2.bx.psu.edu/
Use your account name and password to log in to Galaxy:
Mapping, filtering and conversion to BAM
1 mismatch per read
5 mismatches per read
How can mapping parameters affect the results
False positives vs. false negatives
3-base insertion
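The effect of the mismatch limit can be approximated after mapping by filtering on the NM tag. A minimal sketch, assuming SAM text records that carry the standard optional `NM:i:` tag (edit distance to the reference, which counts inserted and deleted bases as well as mismatches). The function names and the example alignments are hypothetical, and in a real run the limit is a parameter of the mapper itself rather than a post-filter.

```python
def edit_distance(sam_line):
    """Return the NM tag value of a SAM record, or None if it is absent."""
    # Optional tags start at the 12th tab-separated column.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith("NM:i:"):
            return int(field[5:])
    return None


def filter_by_mismatches(sam_lines, max_mismatches):
    """Keep alignments whose edit distance is at most max_mismatches."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        nm = edit_distance(line)
        if nm is not None and nm <= max_mismatches:
            kept.append(line)
    return kept


# Hypothetical alignments (not real data); r3 carries a 3-base insertion.
alignments = [
    "r1\t0\tchr1\t10\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\tNM:i:0",
    "r2\t0\tchr1\t20\t60\t8M\t*\t0\t0\tACGTACGA\tIIIIIIII\tNM:i:1",
    "r3\t0\tchr1\t30\t60\t4M3I4M\t*\t0\t0\tACGTTTTACGT\tIIIIIIIIIII\tNM:i:3",
    "r4\t0\tchr1\t40\t60\t8M\t*\t0\t0\tTTGTACCA\tIIIIIIII\tNM:i:5",
]
for limit in (1, 5):
    kept = filter_by_mismatches(alignments, limit)
    print(limit, [line.split("\t")[0] for line in kept])
# 1 ['r1', 'r2']
# 5 ['r1', 'r2', 'r3', 'r4']
```

With the strict limit the read carrying the insertion is lost (a false negative); with the loose limit a read with five differences is kept, which may be a wrong placement (a false positive).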
One pipeline for all projects?
How can you tune your analysis?
Try different programs.
Mapping:
– Change mapping parameters
– Use non-unique mappings
– Don’t filter duplicates
Variants:
– Change variant filtration
– Change variant merging (penetrance, different heredity, low coverage in one individual…)
– Look for bigger variants: big insertions/deletions, inversions, copy number variations etc.
Gene expression:
– Change the test threshold
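Changing a filtration threshold can be tried quickly by re-running the same filter with several settings and comparing what survives. A minimal sketch, assuming variants are held as simple records with a quality score and a read depth; the field names, threshold values, and records are all hypothetical and not taken from any particular variant caller.

```python
def filter_variants(variants, min_qual, min_depth):
    """Keep variants that pass both the quality and the depth threshold."""
    return [
        v for v in variants
        if v["qual"] >= min_qual and v["depth"] >= min_depth
    ]


# Hypothetical variant records (not real data).
variants = [
    {"id": "v1", "qual": 55.0, "depth": 30},
    {"id": "v2", "qual": 22.0, "depth": 12},
    {"id": "v3", "qual": 48.0, "depth": 4},
    {"id": "v4", "qual": 10.0, "depth": 25},
]

# Sweep from strict to permissive settings and compare the results.
for min_qual, min_depth in [(50, 10), (20, 10), (20, 3)]:
    kept = filter_variants(variants, min_qual, min_depth)
    print(min_qual, min_depth, [v["id"] for v in kept])
# 50 10 ['v1']
# 20 10 ['v1', 'v2']
# 20 3 ['v1', 'v2', 'v3']
```

Lowering the depth threshold rescues v3, the kind of variant that would otherwise be dropped because of low coverage in one individual.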