Page 1: Variation data in VectorBase

November 2007 BRC5 Bethesda

Variation data in VectorBase

Dan Lawson,

VectorBase EMBL-EBI

Page 2: Variation data in VectorBase

Variation database

» Use Ensembl Variation database schema

» Ancillary database to ‘core’

» Perl API for programmatic access

» BioMart implementation for data mining

» Align reads to reference genome and call SNPs
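
The last bullet, aligning reads to a reference and calling SNPs, can be illustrated with a toy sketch. This is not the pipeline used by VectorBase or its data providers; the function name `call_snps`, the thresholds `min_depth` and `min_fraction`, and the example sequences are all assumptions made up for illustration. It assumes reads are already aligned without gaps and simply flags positions where most of the covering reads disagree with the reference.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=2, min_fraction=0.8):
    """Return candidate SNPs as (position, ref_base, alt_base, depth) tuples.

    aligned_reads is a list of (start, sequence) pairs with 0-based start
    offsets on the reference and no gaps (a simplifying assumption).
    """
    # Build a pileup: one base counter per reference position.
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to say anything
        base, n = counts.most_common(1)[0]
        # Flag the position when the dominant read base differs from the reference.
        if base != reference[pos] and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base, depth))
    return snps


# Hypothetical toy data: three overlapping reads that all carry T at position 4.
reference = "ACGTACGTAC"
reads = [(0, "ACGTTC"), (2, "GTTCGT"), (4, "TCGTAC")]
candidate_snps = call_snps(reference, reads)
print(candidate_snps)
```

Real SNP callers also weigh base and mapping quality, handle indels, and model heterozygosity; the sketch only shows the pileup idea behind the bullet.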

Page 3: Variation data in VectorBase

Showing SNPs in ContigView

Page 4: Variation data in VectorBase

Showing SNPs in ContigView

Page 5: Variation data in VectorBase

SNP Report

Page 6: Variation data in VectorBase

SNP Report - SNP context

Page 7: Variation data in VectorBase

TranscriptSNPview

Page 8: Variation data in VectorBase

Page 9: Variation data in VectorBase

GeneSNPview

Page 10: Variation data in VectorBase

Navigation through SNP pages

Page 11: Variation data in VectorBase

Page 12: Variation data in VectorBase

Page 13: Variation data in VectorBase

Alignment of strain sequences

Page 14: Variation data in VectorBase

Tabulated SNP details

Page 15: Variation data in VectorBase

More Anopheles gambiae genomes

» Sequencing of A. gambiae M & S forms complete

» M form (WashU GSC) 2,754,999 reads

» S form (JCVI) 2,714,217 reads

» Sequencing planned for a 12-genome Anopheles cluster

Page 16: Variation data in VectorBase

SNP calling in A. gambiae S form

» Data from Ewen Kirkness (JCVI)

» 2.1 million potential SNPs

Page 17: Variation data in VectorBase

Summary

» (Re)use of a well-established data structure

» Extensive set of visualization tools

» Data mining via BioMart tool

» Programmatic access through the Ensembl Perl API

» Ability to handle re-sequencing data (including new technologies)
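
As a rough illustration of what data mining over a variation database looks like, the sketch below runs a region query against an in-memory SQLite table. It is not the Ensembl Variation schema or API: the table is a heavily simplified stand-in (the real schema refers to sequence regions by an identifier held in the core database rather than by name), the helper `snps_in_region` is hypothetical, and the three rows are invented demo values, not VectorBase data.

```python
import sqlite3

# In-memory database with a simplified, assumed variation table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE variation_feature (
        variation_name   TEXT,
        seq_region_name  TEXT,
        seq_region_start INTEGER,
        seq_region_end   INTEGER,
        allele_string    TEXT
    )
    """
)

# Invented demo rows (not real SNPs).
rows = [
    ("demo_snp_1", "demo_region", 120, 120, "A/T"),
    ("demo_snp_2", "demo_region", 480, 480, "C/G"),
    ("demo_snp_3", "other_region", 150, 150, "G/A"),
]
conn.executemany("INSERT INTO variation_feature VALUES (?, ?, ?, ?, ?)", rows)


def snps_in_region(conn, region, start, end):
    """Return (name, start, alleles) for variations overlapping region:start-end."""
    cur = conn.execute(
        """
        SELECT variation_name, seq_region_start, allele_string
        FROM variation_feature
        WHERE seq_region_name = ?
          AND seq_region_start <= ?
          AND seq_region_end >= ?
        ORDER BY seq_region_start
        """,
        (region, end, start),
    )
    return cur.fetchall()


hits = snps_in_region(conn, "demo_region", 100, 200)
print(hits)
```

Keeping variation records in their own table keyed by genomic coordinates is what lets the same query serve a browser track, a BioMart filter, or a script.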