Megan Milton & Mallory Van Wyngaarden - Managing Barcode Data Library Generation
DESCRIPTION
How to manage barcode data library generation using BOLD Systems
TRANSCRIPT
Barcode of Life Data Systems (BOLD)
www.boldsystems.org (v2.5)
v3.boldsystems.org (v3.0 beta)
Managing Barcode Data Library Generation
Fourth International Barcode of Life Conference - Workshop
Megan Milton and Mallory Van Wyngaarden
Monday, November 28, 2011 – University of Adelaide, Australia
Barcode Library Generation
Needs
• Scope (taxonomic and/or geographic)
• Barcode standards compliance
• Completion of data
• Access by all participants
• Quality control process
• Data curation/updates
• Avoid duplication of effort
• Computational power for analysis
• Protection of data
BOLD Workbench
How BOLD addresses these needs:
• Secure data storage
• Online access anywhere
• Permission-based sharing
• Taxonomy Browser (view progress so far)
• Built-in quality control checks
• Progress feeds/Activity log
• Analysis tools on BOLD compute cluster
User Registration
Getting Started
Requesting an Account
– Requirements:
  • Valid Email Address
  • Institutional Affiliation
  • Password
Getting Started
Creating a Project
– Project Identifiers
  • Project code
  • Project type
– Markers
  • Primary
  • Secondary
– Campaign
– Description
– Project permissions
Project Creation Form
Specimen Page Sequence Page
Getting Started
Barcode Record = Specimen data + Molecular data
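The relationship on this slide can be sketched as a small data structure. This is only an illustration: the class and field names below are assumptions chosen to mirror the headings in this deck, not BOLD's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpecimenData:
    # Hypothetical fields, loosely following the specimen upload headings.
    sample_id: str
    taxonomy: str = ""
    collection_data: str = ""


@dataclass
class MolecularData:
    # Hypothetical fields for the molecular half of a record.
    marker: str
    sequence: str = ""
    trace_files: List[str] = field(default_factory=list)


@dataclass
class BarcodeRecord:
    # A barcode record pairs specimen data with molecular data.
    specimen: SpecimenData
    molecular: Optional[MolecularData] = None

    def is_complete(self) -> bool:
        # "Complete" here means the molecular half exists and has a sequence.
        return self.molecular is not None and bool(self.molecular.sequence)
```

A record created from specimen data alone reports `is_complete()` as `False` until molecular data with a sequence is attached, which matches the upload order described in the workflow.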
Getting Started
Standard Workflow - order of upload
Specimen Data
Images
Traces
Sequences
Specimen Data Submissions
Single Specimen Upload Form
Specimen Data
– Single Uploads
  • Identifiers
  • Taxonomy
  • Specimen Details
  • Collection data
– Batch Uploads
  • New and updated records
  • Template spreadsheet
  • Submit through BOLD to Data Management Team
Image Submissions
Image Library
Image Data
– Required Fields
  • Sample ID
  • Process ID
  • Image File
  • Original Specimen
  • View Metadata
  • Licensing
– Resolution
  • < 20 Megapixels
– Assemble Package
  • Images (.jpeg format)
  • Spreadsheet (template)
  • Maximum zipped file size 190MB
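The "Assemble Package" step could be scripted along the following lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, "190MB" is read as decimal megabytes, the spreadsheet is whatever template file you already filled in, and the megapixel limit is not checked because that would need an image library.

```python
import os
import zipfile

MAX_ZIP_BYTES = 190 * 1000 * 1000  # assumption: "190MB" read as decimal megabytes
IMAGE_EXTENSIONS = (".jpg", ".jpeg")


def build_image_package(image_paths, spreadsheet_path, out_zip):
    """Zip images plus the spreadsheet; raise ValueError if a rule is broken."""
    bad = [p for p in image_paths if not p.lower().endswith(IMAGE_EXTENSIONS)]
    if bad:
        raise ValueError(f"not JPEG files: {bad}")
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in list(image_paths) + [spreadsheet_path]:
            # Store files flat, by base name only.
            zf.write(path, arcname=os.path.basename(path))
    size = os.path.getsize(out_zip)
    if size > MAX_ZIP_BYTES:
        # Remove the oversized archive so it is not submitted by mistake.
        os.remove(out_zip)
        raise ValueError(f"package is {size} bytes, over the {MAX_ZIP_BYTES} limit")
    return size
```

If the size check fails, splitting the images across several smaller packages is the obvious workaround.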
Trace Submissions
Trace File Viewer
Trace Files
– Sequencing details:
  • Trace file in .ab1 or .scf
  • Phred File in .phd.1
  • PCR primers
  • Sequencing primer
  • Direction
  • Marker
  • Attribution to run site
– Assemble Package
  • Electropherograms
  • Spreadsheet (template)
  • Maximum zipped file size 190MB
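Before assembling a trace package, it can help to sort the files in a run folder by the extensions listed above. The helper below is a hypothetical sketch that only looks at file names; it does not open or validate the files themselves.

```python
TRACE_SUFFIXES = (".ab1", ".scf")
PHRED_SUFFIX = ".phd.1"


def sort_trace_files(names):
    """Split file names into trace files, Phred files, and everything else."""
    groups = {"trace": [], "phred": [], "other": []}
    for name in names:
        lower = name.lower()
        # Check the Phred suffix first, since such names may also contain ".ab1".
        if lower.endswith(PHRED_SUFFIX):
            groups["phred"].append(name)
        elif lower.endswith(TRACE_SUFFIXES):
            groups["trace"].append(name)
        else:
            groups["other"].append(name)
    return groups
```

Anything that lands in the `"other"` group is worth a look before the package is zipped.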
Primer Submissions
Primer Database
Primer Database
– Search by
  • Primer code
  • Submitter
  • Target marker
  • Reference/Citation
Primer Submissions
Primer Submission Form
Primers
– Required Fields
  • Primer code
  • Primer description
  • Target marker
  • Primer sequence
  • Reference/Citation
  • Direction
  • *Public/private
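A pre-submission check of the required primer fields might look like the sketch below. The dictionary keys are hypothetical names for the fields on this slide, the starred Public/private item is left out because the slide does not say how it is entered, and restricting the sequence to IUPAC nucleotide codes is an assumption rather than a documented BOLD rule.

```python
REQUIRED_FIELDS = (
    "primer_code", "primer_description", "target_marker",
    "primer_sequence", "reference", "direction",
)
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def primer_problems(record):
    """Return a list of human-readable problems; empty means the record passes."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS
                if not str(record.get(name, "")).strip()]
    sequence = str(record.get("primer_sequence", "")).strip().upper()
    # Report any characters that are not IUPAC nucleotide codes.
    unknown = sorted(set(sequence) - IUPAC_NUCLEOTIDES)
    if unknown:
        problems.append(f"non-IUPAC characters in sequence: {unknown}")
    return problems
```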
Sequence Submissions
Sequence Page
Sequence Data
– Required Fields
  • Aligned sequences in FASTA format
  • Header can use Process ID or Sample ID
  • Marker
  • Run Site (Institution)
  • < 1000 sequences per upload
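The FASTA and batch-size requirements above can be met with a small generator like this one. It is a sketch, not a BOLD tool: the function name and the 80-character line width are assumptions, and "< 1000" is read as at most 999 records per upload.

```python
def fasta_batches(records, batch_size=999, line_width=80):
    """Yield FASTA-formatted strings, each holding at most batch_size records.

    records is an iterable of (identifier, aligned_sequence) pairs, where the
    identifier is a Process ID or a Sample ID.
    """
    batch = []
    for identifier, sequence in records:
        lines = [f">{identifier}"]
        # Wrap the sequence into fixed-width lines.
        lines += [sequence[i:i + line_width]
                  for i in range(0, len(sequence), line_width)]
        batch.append("\n".join(lines))
        if len(batch) == batch_size:
            yield "\n".join(batch) + "\n"
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield "\n".join(batch) + "\n"
```

Each yielded string can be written to its own file and uploaded separately, with the marker and run site supplied on the upload form.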
Project Console
– Project Permissions and Publication
  • Project manager only
– Project Statistics
– Upload/Downloads
– Sequence Analysis
– Specimen Aggregates
– Activity Feed
– Tags and Comments
Project Console
Project Summary
Record List and Icons
Project Summary
Record List
– Identification
– Specimen Page
  • Specimen information
  • Image data
– Sequence Page
  • Sequence(s), trace files and primer
– Icons and flags
– Tagging and Comments on multiple records
Taxon ID Tree
Data Validation
Taxon ID Tree
– Requires: good quality sequences, some level of taxonomy; images are recommended
– Highlights common contaminations
– Colourize by taxonomy, geography, etc.
– Helps to catch misidentifications
– Add pictures for comparison
– Use to help make identifications
Nearest Neighbour Summary
Data Validation
Nearest Neighbour
– Tabular Format
– Requires low level taxonomy
– Highlights:
  • Low Divergence compared to nearest neighbour
  • Divergence that is less than the intra-specific
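The idea behind a nearest-neighbour check can be illustrated offline with a few lines of Python. This sketch uses uncorrected p-distance as a stand-in, since the slides do not state which distance model BOLD applies, and the function names and input layout are assumptions made for the example.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or N are skipped.
    """
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else float("nan")


def nearest_neighbours(records):
    """For each record, find the closest record assigned to another species.

    records maps an identifier to a (species, aligned_sequence) pair.
    Returns {identifier: (neighbour_id, distance)}.
    """
    result = {}
    for rid, (species, seq) in records.items():
        best = None
        for oid, (other_species, other_seq) in records.items():
            if oid == rid or other_species == species:
                continue
            d = p_distance(seq, other_seq)
            if d == d and (best is None or d < best[1]):  # d == d skips NaN
                best = (oid, d)
        if best is not None:
            result[rid] = best
    return result
```

A record whose nearest neighbour in another species sits at a very small distance is the kind of low-divergence case the summary highlights, and is a candidate for a misidentification or contamination review.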
Specimen and Sequence Pages
Data Curation
Editing Records
– Review graphs and flags in Project Summary
– Review and edit specimen page
– Review sequence page
  • Sequence
  • Trace
  • Primer
– Replace or delete images, traces, sequences
Publication
Publishing Project
– Submitting to GenBank
– Making projects public on BOLD
Published Project
Bibliography Submissions
Biblio Submission Form and Publication Database
Bibliography
– Required Fields:
  • Title
  • Authors
  • Abstract
  • Journal details
– Connect to BOLD records
  • Primary records
  • Secondary records