Co-op Presentation: My contribution to the Genome Sciences Centre, September 2015 to April 2016

Upload: elijah-willie

Post on 14-Apr-2017



OVERVIEW

• Pipelines
• Projects
• Validation(s)
• ChimeraScan
• Trinity
• Manta
• Development
• Additional Work
• What I learned
• What I can improve
• Moving forward
• Acknowledgments

Pipelines

• ABySS: Assemble short reads with a de novo, parallel, paired-end sequence assembler.
• Trans-ABySS: Analyze assemblies for structural variants and splice variants using a reference genome and annotations.
• Genome-Validator: Validate fusion and indel events from Trans-ABySS against given BAM files and attempt to assign a ‘tumourigenicity’ of ‘somatic’ or ‘germline’ to events when both a normal and a tumour genome are given.
• Delly: Discover and genotype structural variants in parallel sequencing data using split-read and paired-end evidence.
• Microbial Detection Pipeline: Detect bacterial and/or viral sequences to determine potential contamination or integration into the genome.
• Integration Site Pipeline: Detect putative integration sites of viral sequences in human sequences.
• Probing Pipeline: Detect fusion and SNP mutations in genome and transcriptome libraries.
• Compression and Transfer: Compress files and transfer them off scratch space for archiving and to reduce total scratch space usage.
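The ‘somatic’ versus ‘germline’ assignment mentioned for Genome-Validator can be illustrated with a minimal sketch. This is not Genome-Validator's actual code: the `EventSupport` record, the `classify_event` function, the `min_reads` threshold and the ‘unvalidated’ label are all hypothetical names chosen for illustration. It only assumes the usual convention that an event seen in the tumour but not in the matched normal is somatic, while an event seen in the normal is germline.

```python
from dataclasses import dataclass


@dataclass
class EventSupport:
    """Hypothetical per-event read support (not a Genome-Validator type)."""
    event_id: str
    tumour_reads: int  # reads supporting the event in the tumour BAM
    normal_reads: int  # reads supporting the event in the matched normal BAM


def classify_event(support: EventSupport, min_reads: int = 2) -> str:
    """Label an event by which genome(s) support it.

    min_reads is an assumed threshold for calling an event "supported".
    """
    in_tumour = support.tumour_reads >= min_reads
    in_normal = support.normal_reads >= min_reads
    if not in_tumour and not in_normal:
        return "unvalidated"  # no genome has enough supporting reads
    if in_normal:
        return "germline"     # present in the normal genome
    return "somatic"          # present in the tumour only


if __name__ == "__main__":
    for ev in (EventSupport("fusion_1", 12, 0), EventSupport("indel_7", 9, 8)):
        print(ev.event_id, classify_event(ev))
```

A real validator would also have to weigh coverage, mapping quality and tumour contamination of the normal sample, all of which this sketch ignores.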

Projects

• TCGA LIHC
• TCGA MESO
• NCI HER2 BRCA
• GPH Lymphoma
• TCGA BLCA
• TCGA SARC
• WES CHOL
• TCGA UVM
• COLO-829
• Kaplan
• HCI HIV Cervical
• MCF7
• TCGA THYM

Validations

• ChimeraScan-0.4.5
• hg38 Annotations
• Trinity (partially)
• Manta

ChimeraScan-0.4.5

A software package that detects gene fusions in paired-end RNA sequencing (RNA-Seq) datasets. It differs from other fusion finders such as deFuse in that it adds a fragmentation step on top of the whole paired-end approach, which deFuse also uses.

Script(s):

• setup:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_setup_final.sh
• checker:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_checker.sh
• cleaner:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_cleaner.sh
• binner:
– /projects/trans_scratch/software/chimerascan/scripts/binning_beta.py
• summarizer:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan.sum.sh
• report generator:
– /projects/trans_scratch/software/chimerascan/scripts/ChimeraScan.report.sh

Manta

Rapid detection of structural variants and indels for clinical sequencing applications.

Script(s):

• manta_sum.sh:
– /home/ewillie/tools/scripts/manta_sum.sh
• manta_delly_overlay.py:
– /home/ewillie/tools/scripts/manta_delly_overlay.py
• Manta_gv2_overlay.py:
– /home/ewillie/tools/scripts/manta_gv2_overlay.py
• vcfToBedpe:
– /projects/trans_scratch/software/svtools-Manta2Bedpe/vcfToBedpe

Development

Overlay/Setup Script(s):

• manta_delly_overlay.py:
– /home/ewillie/tools/scripts/manta_delly_overlay.py
• Manta_gv2_overlay.py:
– /home/ewillie/tools/scripts/manta_gv2_overlay.py
• transabyss_defuse_overlay.py:
– /home/ewillie/tools/scripts/transabyss_defuse_overlay.py
• trinity_setup.sh:
– /projects/trans_scratch/trinityrnaseq-2.1.1/trinity_setup.sh

Additional Work

• Assemblies: Run ABySS to assemble sample(s) for further downstream analysis.
• Analyses: Run various analysis tools on data and compare their results by means of overlays and/or visualization.
• Overlays: Compare results between different tools or different settings to find similarities and differences. The overlays are done using appropriate scripts, and Venn diagrams are generated to help illustrate the similarities and/or differences.
• Testing Scripts: New scripts such as integration_pipeline.sh were tested for potential bugs and ease of use. Testing was done iteratively, with each iteration providing more confidence.
• ChimeraScan Wiki: Create a comprehensive wiki with information regarding validation and a detailed procedure for running the tool. The wiki also covers the installation procedure, resource requirements, how to interpret the outputs, and debugging information.
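The overlay idea described above can be sketched as follows. This is a minimal illustration and not the logic of manta_delly_overlay.py or the other overlay scripts: the `Breakpoint` type, the `overlay` function, the 10 bp `tolerance` and the example coordinates are all assumptions made for this example. It matches calls from two tools on the same chromosome within a position tolerance and returns the three groups that a two-set Venn diagram would show.

```python
from typing import List, NamedTuple, Tuple


class Breakpoint(NamedTuple):
    """Hypothetical simplified call: one chromosome and one position."""
    chrom: str
    pos: int


def overlay(
    calls_a: List[Breakpoint],
    calls_b: List[Breakpoint],
    tolerance: int = 10,
) -> Tuple[List[Breakpoint], List[Tuple[Breakpoint, Breakpoint]], List[Breakpoint]]:
    """Split two call sets into A-only, shared, and B-only.

    Matching is greedy: each call in A takes the first unmatched call in B
    on the same chromosome within `tolerance` bases.
    """
    matched_b = set()
    only_a, shared = [], []
    for a in calls_a:
        hit = None
        for i, b in enumerate(calls_b):
            if i in matched_b:
                continue
            if a.chrom == b.chrom and abs(a.pos - b.pos) <= tolerance:
                hit = i
                break
        if hit is None:
            only_a.append(a)
        else:
            matched_b.add(hit)
            shared.append((a, calls_b[hit]))
    only_b = [b for i, b in enumerate(calls_b) if i not in matched_b]
    return only_a, shared, only_b


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    tool_a = [Breakpoint("chr1", 1000), Breakpoint("chr2", 5000), Breakpoint("chr3", 42)]
    tool_b = [Breakpoint("chr1", 1004), Breakpoint("chr2", 9000)]
    only_a, shared, only_b = overlay(tool_a, tool_b)
    # The three counts are the regions of a two-set Venn diagram.
    print(len(only_a), len(shared), len(only_b))
```

Real structural variant calls have two breakends, a type and an orientation, so a production overlay would compare more than a single position; the nested loop is also quadratic and would need indexing for large call sets.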

ChimeraScan Wiki

What I Learned

• Real-world applications of bioinformatics.
• Problem solving, including troubleshooting, debugging and querying the literature.
• Bash scripting, including substantial knowledge of terminal commands.
• Writing scripts to save time and improve the efficiency of jobs (do a job manually for more than 2 hours, or write a script to do it in a fraction of that time).
• Greater attention to detail to help reduce my rate of errors.
• Time management, task prioritization and meeting deadlines.
• Visualizing and analyzing structural variants using IGV.

What I could work on

• Problem solving and troubleshooting skills.
• Deeper understanding of the SVIA pipeline tools.
• Clear and concise presentation of my results.
• Minimizing my rate of error when performing tasks.
• Verbal presentation skills.
• Developing an appetite for personal projects.

ANY SUGGESTIONS????????

Moving Forward

• My interest in the algorithmic aspect of genomics has grown tremendously, enticing me to take more applied algorithm courses.
• I plan to obtain a genomics certificate as part of my degree to further develop my interest in genomic sciences.
• Since I am now aware of the qualities and skills that are needed to be successful in this rapidly changing industry, I will be dedicating time to further develop these qualities and sharpen these skills.
• I will improve my scripting abilities in both Python and Bash to build on the experience I have gained here during the last eight months.
• I will apply the knowledge and skills I have acquired here in order to be successful in a different work environment.

Acknowledgement

• Karen Mungall
• Yussanne Ma
• Caleb Choo
• Caralyn Reisle
• Dustin Bleile
• Melika Bonakdar
• Stuart Zong
• Diana Palmquist
• Gordon Robertson
• Serena Chan
• Karen Eddy