![Page 1: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/1.jpg)
C3BI
VARIANT CALLING
November 2016
Pierre Lechat, Stéphane Descorps-Declère
![Page 2: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/2.jpg)
General Workflow (GATK)
![Page 3: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/3.jpg)
Software websites

| Software | Website |
|---|---|
| bwa | http://bio-bwa.sourceforge.net/ |
| picard | http://picard.sourceforge.net/ |
| samtools | http://samtools.sourceforge.net/ |
| GATK | http://www.broadinstitute.org/gatk/ |
| IGV | http://software.broadinstitute.org/software/igv/ |
| tablet | http://bioinf.scri.ac.uk/tablet/ |
| vcftools | http://vcftools.sourceforge.net/ |
![Page 4: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/4.jpg)
General Workflow (GATK)
![Page 5: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/5.jpg)
Raw Sequence Data Format
• FASTQ format: each record holds a sequence ID, the sequence, and a quality score string
• Phred quality score: one character per base, from `!` (Phred score 0, error rate 1) to `S` (Phred score 50, error rate 0.00001)

Quality characters: !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRS

Phred score = -10 * log10(P), where P is the error rate of the base call
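The Phred relation on this slide can be checked with a short Python sketch. It assumes the encoding suggested by the character scale above, where `!` (ASCII 33) stands for Phred 0; the function names are illustrative only.

```python
import math

PHRED_OFFSET = 33  # assumption: '!' (ASCII 33) encodes Phred 0, as on the scale above


def phred_to_error(q):
    """Error probability P for a Phred score Q, from Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred score for an error probability P."""
    return -10 * math.log10(p)


def decode_quality(quality_string):
    """Turn a FASTQ quality string into a list of integer Phred scores."""
    return [ord(char) - PHRED_OFFSET for char in quality_string]


scores = decode_quality("!5I")  # three bases with very different qualities
for q in scores:
    print(q, phred_to_error(q))
```

A Phred score of 20 corresponds to one expected error in 100 base calls, and a score of 40 to one in 10,000.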
![Page 6: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/6.jpg)
Quality Score
• Q-Score = Quality Table(Quality Predictors)
![Page 7: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/7.jpg)
How do my newly obtained data look?
Check the overall data quality first. FastQC is a widely used tool for this quality assessment.
Good quality! Poor quality!
Quality Checks
![Page 8: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/8.jpg)
What do I do when FastQC calls my data poor?
• Poor quality at the read ends can be remedied with “quality trimmers” such as trimmomatic, fastx-toolkit, etc.
• Left-over adapter sequences in the reads can be removed with “adapter trimmers” such as trimmomatic.
• Always trim adapters as a matter of routine.
Once the trimmers have been used, it is best to rerun the data through FastQC to check the resulting data
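To make the idea of end trimming concrete, here is a minimal Python sketch that removes low-quality bases from the 3' end of a read. It is not the algorithm of trimmomatic or fastx-toolkit; the function name, the threshold of 20 and the Phred+33 offset are assumptions chosen for illustration.

```python
def trim_3prime(sequence, quality, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    sequence and quality are the two strings of one FASTQ record;
    offset=33 assumes the Phred+33 encoding.
    """
    end = len(sequence)
    # Walk back from the 3' end while the last kept base is below the threshold.
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return sequence[:end], quality[:end]


seq, qual = trim_3prime("ACGTACGT", "IIIII5#!")
print(seq, qual)
```

Only the tail is examined here: a low-quality base in the middle of a read is left untouched, which is one reason real trimmers offer sliding-window modes as well.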
Quality Checks
![Page 9: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/9.jpg)
Before quality trimming After quality trimming
Quality Checks
![Page 10: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/10.jpg)
QC and Mapping
You don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped.
Heng Li.
It is still recommended to trim adapter sequences. After all, adapters are not part of the samples you are sequencing. They might affect variant calling in corner cases.
Heng Li.
![Page 11: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/11.jpg)
General Workflow
![Page 12: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/12.jpg)
Short Read Alignment
Sequencing machine
And you get MILLIONS of them!
![Page 13: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/13.jpg)
Short Read Mapping
Need to map them back to the reference chromosomes
![Page 14: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/14.jpg)
Mapping
Find the best match for the read in a reference sequence

Reference sequence: CTGACCTCATGTGATCCACCCGCCTTGGCC
Read (length 8 bases): TGATCCAC

Challenges:
• Errors in reads
• Errors in libraries
• Repetitive regions (repeats, homologous regions)
• Homopolymers
• Individual polymorphisms
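The toy example on this slide can be reproduced with a brute-force Python sketch that slides the read along the reference and counts mismatches at each offset. Real mappers such as BWA use indexed data structures and allow gaps; this ungapped scan and the name `best_match` are only illustrative assumptions.

```python
def best_match(reference, read):
    """Return (offset, mismatches) of the ungapped placement with the fewest mismatches.

    Offsets are 0-based; ties are resolved in favour of the leftmost placement.
    """
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(window, read))
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


reference = "CTGACCTCATGTGATCCACCCGCCTTGGCC"
read = "TGATCCAC"
print(best_match(reference, read))
```

The scan costs time proportional to reference length times read length for every read, which is why it does not scale to millions of reads against a whole genome.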
![Page 15: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/15.jpg)
Different Mapping Algorithms
• BWA – 2009
• BWA-SW – 2010
• BWA-MEM – 2013
• Bowtie – 2009
• Bowtie2 – 2012
• Gem – 2012
• Cushaw2 – 2014
• Novoalign

Li, arXiv:1303.3997 (2013)
Further reading: “A survey of sequence alignment algorithms for next-generation sequencing”, Li H. and Homer N., 2010, Briefings in Bioinformatics
![Page 16: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/16.jpg)
SAM/BAM Format
• SAM (Sequence Alignment/Map) format:
  – Single unified format for storing read alignments to a reference genome
• BAM (Binary Alignment/Map) format:
  – Binary equivalent of SAM
  – Advantages
    – Supports indexing
    – Compact size
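As an illustration of what a SAM alignment line contains, the following Python sketch splits one tab-separated record into the eleven mandatory fields of the SAM specification plus optional tags. The example line is made up for this sketch, and `parse_sam_line` is a hypothetical helper, not part of samtools or any library.

```python
from collections import namedtuple

# The eleven mandatory SAM columns, followed by any optional TAG:TYPE:VALUE fields.
SamRecord = namedtuple(
    "SamRecord",
    "qname flag rname pos mapq cigar rnext pnext tlen seq qual tags",
)

INTEGER_COLUMNS = {1, 3, 4, 7, 8}  # FLAG, POS, MAPQ, PNEXT, TLEN


def parse_sam_line(line):
    """Parse one alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    core = [int(f) if i in INTEGER_COLUMNS else f for i, f in enumerate(fields[:11])]
    return SamRecord(*core, tags=fields[11:])


# Illustrative record: an 8-base read aligned at 1-based position 12 of "chr1".
line = "read1\t0\tchr1\t12\t60\t8M\t*\t0\t0\tTGATCCAC\tIIIIIIII\tNM:i:0"
rec = parse_sam_line(line)
print(rec.rname, rec.pos, rec.mapq, rec.cigar)
print("unmapped:", bool(rec.flag & 0x4), "duplicate:", bool(rec.flag & 0x400))
```

BAM stores the same fields in compressed binary form, so in practice a library or `samtools view` is used to read it rather than text parsing.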
![Page 17: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/17.jpg)
BAM Improvement
• Remove duplicates
• Local realignment
• Base quality recalibration
![Page 18: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/18.jpg)
Mapping Quality (MQ)
What if there are several possible places to align your sequencing read? This may be due to:
- Repeated elements in the genome
- Low complexity sequences
- Reference errors and gaps

MQ is a Phred score of the quality of the alignment:
- high MQ: a single match
- low MQ: the probability of mapping to different locations is high, but no perfect multiple matches
- MQ0: a perfect multiple match

[Figure: raw reads aligned to the reference genome]
![Page 19: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/19.jpg)
Library Duplicates
• No second-generation sequencing platform performs single-molecule sequencing
  – The PCR amplification step in library preparation can result in duplicate DNA fragments in the final library prep.
  – PCR-free protocols do exist, but they require large amounts of input DNA.
• Duplicates can result in false SNP calls
  – Duplicates manifest themselves as high read depth support
![Page 20: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/20.jpg)
Duplicates and False SNP Calls
![Page 21: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/21.jpg)
Remove Duplicates
• Identify read pairs whose outer ends map to the same position on the genome and remove all but one copy:
  – Samtools: samtools rmdup or samtools rmdupse
  – Picard/GATK: MarkDuplicates
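The rule above can be sketched in Python as grouping read pairs by the coordinates of their outer ends and keeping one copy per group. This is a simplified illustration, not the implementation of samtools or Picard; the dictionary keys and the choice to keep the copy with the highest summed base quality are assumptions made for this example.

```python
def find_duplicates(pairs):
    """Return the names of read pairs to remove, keeping one copy per position.

    Each pair is a dict with hypothetical keys: name, chrom, start, end,
    strand, qual_sum (sum of base qualities over both mates).
    """
    best = {}
    for pair in pairs:
        key = (pair["chrom"], pair["start"], pair["end"], pair["strand"])
        # Assumption: the copy with the highest summed base quality is kept.
        if key not in best or pair["qual_sum"] > best[key]["qual_sum"]:
            best[key] = pair
    kept = {pair["name"] for pair in best.values()}
    return [pair["name"] for pair in pairs if pair["name"] not in kept]


pairs = [
    {"name": "p1", "chrom": "chr1", "start": 100, "end": 350, "strand": "+", "qual_sum": 700},
    {"name": "p2", "chrom": "chr1", "start": 100, "end": 350, "strand": "+", "qual_sum": 820},
    {"name": "p3", "chrom": "chr1", "start": 120, "end": 350, "strand": "+", "qual_sum": 650},
]
print(find_duplicates(pairs))
```

Pair `p3` shares only one outer end with the others, so it is treated as an independent fragment.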
![Page 22: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/22.jpg)
Local Realignment - indels
• The trouble with mapping approaches
![Page 23: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/23.jpg)
![Page 24: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/24.jpg)
![Page 25: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/25.jpg)
Local realignment in GATK
• Uses information from known SNPs/indels (dbSNP, 1000 Genomes)
• Uses information from other reads
• Smith-Waterman exhaustive alignment on select reads
![Page 26: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/26.jpg)
Base Quality Recalibration
The quality of a base call depends on multiple factors (e.g. position in the read, sequence context). In addition, the alignment can provide useful information: mismatches to the reference are considered errors (unless they are described polymorphisms).
Recalibration supports several platforms: Illumina, SOLiD, 454, Complete Genomics, Pacific Biosciences (stated on the website) and IonTorrent (stated in the GATK forum).
It combines all the available information to re-evaluate the probability of a wrong call at each position in each read.
It requires a catalogue of variable sites!
We will not run it, but you can find out how to do it at http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_bqsr_BaseRecalibrator.html
![Page 27: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/27.jpg)
Base Quality Recalibration
More information : http://zenfractal.com/2014/01/25/bqsr/
![Page 28: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/28.jpg)
General Workflow (GATK)
![Page 29: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/29.jpg)
Variant Calling
• SNP Calling
• Short Indels
• Structural Variants
• A variant call is a conclusion: there is a nucleotide difference vs. some reference at a given position in an individual genome or transcriptome.
• Sometimes accompanied by an estimate of variant frequency and some measure of confidence
![Page 30: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/30.jpg)
Structural Variants (SVs)
Structural variant is an umbrella term for a group of genomic alterations involving segments of DNA typically larger than 1 kb.
The structural variation may be
• Quantitative (CNVs – indels and duplications)
• Positional (translocations)
• Orientational (inversions)
![Page 31: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/31.jpg)
Structural Variants Detection
[Figure: three signatures of a deletion when reads are mapped to the reference genome: 1. Paired ends (sequenced paired-ends), 2. Read depth (read count against the zero level), 3. Split read]
![Page 32: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/32.jpg)
Sequence Read Depth Analysis
[Figure: reads from an individual sequence are mapped to the reference genome; counting mapped reads gives the read depth signal above the zero level]
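Counting mapped reads into a read depth signal can be sketched in a few lines of Python. The bin size, the median-based threshold and both function names are assumptions for illustration; this is a toy counter, not the method implemented in CNVnator.

```python
def read_depth_signal(read_starts, genome_length, bin_size):
    """Count mapped reads per fixed-size bin along the reference (0-based starts)."""
    n_bins = (genome_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for start in read_starts:
        counts[start // bin_size] += 1
    return counts


def low_depth_bins(counts, ratio=0.5):
    """Bins whose count is below ratio * median: candidate deletions."""
    median = sorted(counts)[len(counts) // 2]
    return [i for i, count in enumerate(counts) if count < ratio * median]


# Toy data: four bins of 100 bp, with few reads starting in the third bin.
starts = [5, 20, 60, 90, 110, 130, 150, 190, 250, 310, 330, 360, 390]
counts = read_depth_signal(starts, genome_length=400, bin_size=100)
print(counts, low_depth_bins(counts))
```

A drop in the signal suggests a deletion and a rise suggests a duplication, but coverage also varies with GC content and mappability, so real tools normalise the signal before segmenting it.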
![Page 33: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/33.jpg)
CNVnator on RD data
NA12878, Solexa 36 bp paired reads, ~28x coverage
![Page 34: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/34.jpg)
Paired-ends methods
![Page 35: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/35.jpg)
Paired-end methods (PEM), continued
Deletion? Insertion?
Deletion? Insertion?
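The question on this slide comes down to comparing the distance between mapped mates with the insert size expected from the library. A minimal Python sketch, assuming a known expected insert size and a tolerance (both hypothetical parameters, as is the function name):

```python
def classify_pair(observed_span, expected_insert, tolerance):
    """Classify a read pair by the distance between its mates on the reference.

    observed_span: distance between the outer ends of the mapped mates.
    expected_insert and tolerance describe the library (assumed known here).
    """
    if observed_span > expected_insert + tolerance:
        # Mates map too far apart: the sample lacks sequence present in the reference.
        return "deletion?"
    if observed_span < expected_insert - tolerance:
        # Mates map too close together: the sample carries extra sequence.
        return "insertion?"
    return "concordant"


for span in (900, 300, 520):
    print(span, classify_pair(span, expected_insert=500, tolerance=50))
```

A single discordant pair is weak evidence; callers based on this signal usually require a cluster of pairs supporting the same event.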
![Page 36: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/36.jpg)
Different classes of SVs
Nature Reviews Genetics 12, 363-376 (May 2011) | doi:10.1038/nrg2958
![Page 37: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/37.jpg)
Bioinformatics. 2010 Aug 1; 26(15): 1895–1896.
![Page 38: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/38.jpg)
SNP Calling
• SNP – single nucleotide polymorphism
  – Examine the bases aligned to a position and look for differences
• Factors to consider when calling SNPs
  – Base call qualities of each supporting base
  – Proximity to small indels
  – Mapping qualities of the reads supporting the SNP
  – Read length
  – Paired reads
  – Sequencing depth
  – Cluster of SNPs
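A naive version of "examine the bases aligned to a position and look for differences" can be written as follows. This is a counting sketch under assumed thresholds, not the statistical model used by GATK or samtools; the function name and all default values are hypothetical.

```python
from collections import Counter


def call_snp(ref_base, pileup, min_bq=20, min_mq=50, min_depth=4, min_alt_fraction=0.2):
    """Return (alt_base, alt_fraction) or None for one reference position.

    pileup is a list of (base, base_quality, mapping_quality) tuples,
    one per read covering the position.
    """
    # Keep only bases supported by good base calls on well-mapped reads.
    usable = [base for base, bq, mq in pileup if bq >= min_bq and mq >= min_mq]
    if len(usable) < min_depth:
        return None
    alt_counts = Counter(base for base in usable if base != ref_base)
    if not alt_counts:
        return None
    alt_base, count = alt_counts.most_common(1)[0]
    fraction = count / len(usable)
    return (alt_base, fraction) if fraction >= min_alt_fraction else None


pileup = [("A", 30, 60)] * 3 + [("G", 30, 60)] * 3 + [("G", 5, 60)]
print(call_snp("A", pileup))
```

The last `G` in the example is ignored because its base quality is below the threshold, which shows how base and mapping qualities enter the decision.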
![Page 39: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/39.jpg)
Example SNP
http://www.sanger.ac.uk/mousegenomes
![Page 40: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/40.jpg)
Short Indel Calling
• Small insertions and deletions observed in the alignment of the read relative to the reference genome
• Factors to consider when calling indels
  – Misalignment of the read
  – Homopolymer runs either side of the indel (e.g. AAAA or TTTTTTTT)
  – Length of the reads
![Page 41: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/41.jpg)
Example Indel
http://www.sanger.ac.uk/mousegenomes
![Page 42: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/42.jpg)
How does the HaplotypeCaller work?
![Page 43: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/43.jpg)
General Workflow (GATK)
![Page 44: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/44.jpg)
Pileup format
![Page 45: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/45.jpg)
Pileup format
![Page 46: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/46.jpg)
Example
![Page 47: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/47.jpg)
Variant Call Format (VCF)
• VCF is a standardized format for storing DNA polymorphism data
  – SNPs, insertions, deletions and structural variants
  – With rich annotations
• Indexed for fast data retrieval of variants from a range of positions
• Store variant information across many samples
• Record meta-data about the site
  – dbSNP accession, filter status, validation status
• Very flexible format
![Page 48: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/48.jpg)
Variant Call Format (VCF)

Header
• Lines starting with ##: arbitrary number of meta-information lines
• Line starting with #: column definition; mandatory columns include:

| Column | Meaning |
|---|---|
| CHROM | chromosome |
| POS | position of the start of the variant |
| ID | unique identifier of the variant (e.g. rs number for SNPs) |
| REF | reference allele |
| ALT | comma-separated list of alternate non-reference alleles |
| QUAL | Phred-scaled quality score |
| FILTER | site filtering information |
| INFO | user-extensible annotation (e.g. samtools and GATK may differ in this) |

Data
• One line per site (all columns described above per line); useful information per site and per sample
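The header and data layout of a VCF file can be illustrated with a small Python reader. The three example lines are invented for this sketch and `parse_vcf` is a hypothetical helper; real files are better read with a dedicated library.

```python
def parse_vcf(lines):
    """Split VCF text lines into meta-information, column names and records."""
    meta, columns, records = [], None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            meta.append(line)                 # meta-information line
        elif line.startswith("#"):
            columns = line[1:].split("\t")    # column definition line
        elif line:
            record = dict(zip(columns, line.split("\t")))
            record["POS"] = int(record["POS"])
            record["ALT"] = record["ALT"].split(",")  # comma-separated alleles
            records.append(record)
    return meta, columns, records


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12\t.\tT\tC,G\t50\tPASS\tDP=14",
]
meta, columns, records = parse_vcf(example)
print(columns)
print(records[0]["CHROM"], records[0]["POS"], records[0]["REF"], records[0]["ALT"])
```

The INFO column is left as a raw string here because its keys differ between callers.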
![Page 49: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/49.jpg)
Variant Call Format (VCF)
http://vcftools.sourceforge.net/specs.html
Example:
![Page 50: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/50.jpg)
SnpEff: annotation of variants
SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants (such as amino acid changes).
A typical SnpEff use case would be:
• Input: The inputs are predicted variants (SNPs, insertions, deletions and MNPs). The input file is usually obtained as a result of a sequencing experiment, and it is usually in variant call format (VCF).
• Output: SnpEff analyzes the input variants. It annotates the variants and calculates the effects they produce on known genes (e.g. amino acid changes).
![Page 51: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/51.jpg)
Filtering SNP Rules
Common cautions (*):
- Base quality: BQ20
- Depth (min and max): very dependent on your average depth
- Mapping quality: MQ50/60
- Strand bias
- SNP density: dependent on the genome [e.g. no more than 1 SNP/10 bp]
- Indel proximity: not closer than 10 bp to an indel

(*) Some filters may be applied during the variant calling while others are applied afterwards
Further reading: “Consensus Rules in Variant Detection from Next-Generation Sequencing Data”, Jia et al., 2012, PLoS One
Keep in mind your project may have some specific requirements!
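Two of the positional rules on this slide, SNP density and indel proximity, can be sketched as a post-calling filter in Python. The 10 bp window comes from the examples above; treating both SNPs of a close pair as suspect, working on a single chromosome, and the function name are assumptions of this sketch.

```python
def filter_snps(snp_positions, indel_positions, window=10):
    """Keep SNP positions that are at least `window` bp from any indel and any other SNP.

    All positions are assumed to lie on the same chromosome.
    """
    snps = sorted(snp_positions)
    kept = []
    for i, pos in enumerate(snps):
        near_indel = any(abs(pos - indel) < window for indel in indel_positions)
        near_previous = i > 0 and pos - snps[i - 1] < window
        near_next = i + 1 < len(snps) and snps[i + 1] - pos < window
        if not (near_indel or near_previous or near_next):
            kept.append(pos)
    return kept


print(filter_snps([100, 105, 200, 300], indel_positions=[295]))
```

Positions 100 and 105 fail the density rule and 300 fails the indel rule; thresholds like these should be adapted to the genome and project at hand.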
![Page 52: C3BI - Institut Pasteur...Y ou don’t need to do quality trimming with bwa-mem. […] Bwa-mem largely does local alignment. If a tail cannot be mapped well, it will be (soft) clipped](https://reader034.vdocument.in/reader034/viewer/2022051805/5ff8e3fcebb2ac631e59944f/html5/thumbnails/52.jpg)
What do I do if I don’t have a valid reference?