Genboree Microbiome Workbench 16S Workshop Part I, March 11th, 2014. Julia Cope, Emily Hollister, Kevin Riehle



Page 1

Genboree Microbiome Workbench 16S Workshop Part I

March 11th, 2014

Julia Cope, Emily Hollister, Kevin Riehle

Page 2

Genboree 16S Workshop

• Learning Objectives
  – Students should be able to take .sff files and user-supplied information and produce:
    • Metadata File
    • PCoA
    • Classification Distribution
• Expectations
  – Apply topics learned today before next meeting
  – Be able to discuss where issues arise
  – Be able to move knowledgeably through the whole Genboree Workflow

Page 3

Genboree 16S Workshop Part II

• Learning Outcomes
  – Newer database version of RDP: how to take advantage?
  – Students should take user .sff files and a user-created metadata file and produce (I can provide files if needed):
    • PCoA (QIIME)
    • Classification Distribution (RDP)
• Expectations
  – Apply topics learned in tutorial
  – Be able to discuss where in the process issues arose
  – Have a hypothesis about your data issues if they happen

Page 4

Workshop Outline

• 16S
• Metadata File
• Genboree Workbench Workflow
  – Account
  – Group
  – Database
  – Project
  – Loading your files/samples/sequences (and linking)
  – QIIME
  – RDP
  – How to get help
• Wrap Up and Preparation for 2nd Installment

Page 5

Resources

• Genboree Home Screen
  – http://genboree.org
• Tutorials are located in the Genboree Commons
  – You must be signed in to open the following link
  – http://genboree.org/theCommons/projects/mw-march-2014
  – Tutorial 1 Data Set:
    • http://www.genboree.org/microbiome/include/data/tutorial_sequence_file.sff.gz
  – Tutorial 2 Data Set:
    • http://genboree.org/theCommons/attachments/3545/Tutorial_2.zip
• Projects are accessed through the Genboree Workbench

Page 6

Page 7

16S

• What is it?
• What part is being sequenced?
  – Here?
  – Elsewhere?
• How is this accomplished?
  – DNA to bead to light
  – Intro. to flow data and .sff file content
  – OUTPUT is an .sff file
  – Aside on zipping methods and large file transfers

Page 8

16S

• What is it? 16 Svedberg: the 16S rRNA, part of the small subunit of the ribosome
• What part is being sequenced?
  – Here? TCMC sequences the V5-V3 by 454
  – Elsewhere? V3-V5, V1-V3, V9, V7-V9… many more
  – Know your variable regions

Image credits: Allmetrics.net Sales Material; Tortoli E. Clin. Microbiol. Rev. 2003;16:319-354

Page 9

16S

• How is this accomplished?
  – DNA to bead to light

Image credits: http://cage.unl.edu/equipmentsoftware.shtml; 454 Life Sciences Sales Materials

Page 10

16S

• How is this accomplished?
  – DNA to bead to light

Image credits: http://cage.unl.edu/equipmentsoftware.shtml; 454 Life Sciences Sales Materials

Page 11

16S

• How is this accomplished?
  – DNA to bead to light
  – Intro to flow data and .sff file content
  – OUTPUT is an .sff file
  – Standard Flowgram Format
    • All reads are structured as linker-tag-primer
    • Provides both identity and quality information

Image credits: http://cage.unl.edu/equipmentsoftware.shtml; Allmetrics.net Sales Material
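To make the "sff file content" point concrete, here is a minimal Python sketch that reads the common header at the start of a Standard Flowgram Format file. It assumes the publicly documented SFF layout (big-endian fields, magic number spelling ".sff"); the function name `parse_sff_header` and the `SffHeader` tuple are illustrative and are not part of Genboree or of any 454 software.

```python
import struct
from typing import NamedTuple

# Fixed-size part of the SFF common header (assumed layout, big-endian, 31 bytes):
# magic, version, index offset, index length, number of reads,
# header length, key length, flows per read, flowgram format code.
_COMMON_HEADER = struct.Struct(">I4sQIIHHHB")
SFF_MAGIC = 0x2E736666  # the ASCII bytes ".sff"


class SffHeader(NamedTuple):
    version: bytes
    number_of_reads: int
    flows_per_read: int
    flow_chars: str    # nucleotide flowed in each cycle, e.g. "TACG..."
    key_sequence: str  # short key at the start of every read


def parse_sff_header(data: bytes) -> SffHeader:
    """Parse the common header found at the start of an .sff file."""
    if len(data) < _COMMON_HEADER.size:
        raise ValueError("too short to be an SFF file")
    (magic, version, _index_offset, _index_length, number_of_reads,
     _header_length, key_length, flows_per_read,
     _format_code) = _COMMON_HEADER.unpack_from(data, 0)
    if magic != SFF_MAGIC:
        raise ValueError("missing .sff magic number")
    start = _COMMON_HEADER.size
    if len(data) < start + flows_per_read + key_length:
        raise ValueError("header is truncated")
    flow_chars = data[start:start + flows_per_read].decode("ascii")
    start += flows_per_read
    key_sequence = data[start:start + key_length].decode("ascii")
    return SffHeader(version, number_of_reads, flows_per_read,
                     flow_chars, key_sequence)
```

Passing the first few kilobytes of an .sff file (opened in binary mode) is enough to recover the read count and flow order. The per-read flow values, bases, and quality scores follow in later sections that this sketch does not parse.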

Page 12

Genboree Workflow

• Take one step back from the Genboree Workflow and talk about input files.
• What do you do with your files?

[Figure: input files, Meta-data and .sff. From: Genboree.org help files]

Page 13

Genboree Workflow

• What do you do with many files?
• Genboree takes .zip, .gzip, .txt, and .sff files
  – Compressed files are easier and faster to move
  – Multiple files are easier to move when compressed together in an archive
• .sff file(s) should be archived and compressed.
• Metadata files are very small and do not need compression.

[Figure: one Meta-data file alongside several .sff files]
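As a rough illustration of archiving several .sff files together, the sketch below bundles every .sff file in a folder into one compressed .zip using Python's standard `zipfile` module. The function name `archive_sff_files` and the folder and archive names in the usage note are hypothetical; the workshop itself demonstrates a desktop tool, so treat this as one possible alternative.

```python
import zipfile
from pathlib import Path


def archive_sff_files(source_dir, archive_path):
    """Bundle every .sff file in source_dir into one compressed .zip archive."""
    sff_files = sorted(Path(source_dir).glob("*.sff"))
    if not sff_files:
        raise FileNotFoundError(f"no .sff files found in {source_dir}")
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED) as archive:
        for sff in sff_files:
            # Store only the file name so the archive has no nested folders.
            archive.write(sff, arcname=sff.name)
    return [sff.name for sff in sff_files]
```

For example, `archive_sff_files("run_01", "run_01_sff.zip")` would produce a single archive to upload, while the small metadata file stays uncompressed next to it.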

Page 14

Metadata Files

• What data must you have?
• How should it be formatted for Genboree?
• What can you include?
• How to make it tab-delimited
• Include variable region or primer?
• Directional awareness on primers

Page 15

Metadata Files

• What data must you have?
  – name
  – barcode
  – region or proximal & distal
  – First column must begin with #
  – #No_spaces_are_allowed_in_column_names_0123456789
• How should it be formatted for Genboree?
  – Tab delimited
• What can you include?
• How to make it tab-delimited?
• Include variable region or primer?
• Directional awareness on primers

Page 16

Metadata Files

• How to determine which to include - variable region or primers
• Directional awareness on primers
• Demo of making and saving as tab delimited

#name        barcode      proximal            distal             region  body_site
S_700033665  CCGTTCCTC    CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700035861  ACCGGCGTTC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700095543  ACGAATTAAC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700095850  AACCGGATAC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700101600  AACGGAACGC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
T_700016994  AATAACCGTC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700095565  TTAATGGAAC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700095872  CGGACCGGAAC  CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700101388  CCGAACGAC    CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700101622  TTCGTTCTTC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat

Columns: proximal and distal, or region.
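Saving as tab-delimited can also be done without a spreadsheet. The sketch below writes the first two sample rows from the slide with Python's `csv` module; the column names come from the slide, while the function name `write_metadata` and the output file name in the usage note are assumptions made for illustration.

```python
import csv

# Column names as shown on the slide; the first one must begin with "#".
COLUMNS = ["#name", "barcode", "proximal", "distal", "region", "body_site"]

# First two sample rows from the slide.
SAMPLES = [
    ["S_700033665", "CCGTTCCTC", "CCGTCAATTCMTTTRAGT", "CTGCTGCCTCCCGTAGG", "V3V5", "Stool"],
    ["S_700035861", "ACCGGCGTTC", "CCGTCAATTCMTTTRAGT", "CTGCTGCCTCCCGTAGG", "V3V5", "Stool"],
]


def write_metadata(rows, path):
    """Write the header and sample rows as a tab-delimited text file."""
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, found {len(row)}")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(COLUMNS)
        writer.writerows(rows)
```

Calling `write_metadata(SAMPLES, "stool_throat_metadata.txt")` gives a plain text file with one tab between every pair of columns.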

Page 17

Metadata Files - Demo

#name        barcode      proximal            distal             region  body_site
S_700033665  CCGTTCCTC    CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700035861  ACCGGCGTTC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700095543  ACGAATTAAC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700095850  AACCGGATAC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700101600  AACGGAACGC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
T_700016994  AATAACCGTC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700095565  TTAATGGAAC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700095872  CGGACCGGAAC  CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700101388  CCGAACGAC    CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat
T_700101622  TTCGTTCTTC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Throat

Columns: proximal and distal, or region.

• Select the data above and Copy.
• Paste into Excel or an open source spreadsheet program. Be sure all entries are free of spaces and special characters and that all samples have the same number of columns. Avoid the column titles "state" and "type".
• Save As and select tab-delimited.
• Name your file in a clear and consistent manner.
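The formatting rules on this slide can be checked before upload. The following is a minimal sketch of such a check, assuming the metadata has already been saved as tab-delimited text; `check_metadata` is a hypothetical helper, not a Genboree tool, and it tests only for spaces rather than every possible special character.

```python
def check_metadata(text):
    """Return a list of problems found in tab-delimited metadata text."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ["file is empty"]
    problems = []
    header = lines[0].split("\t")
    if not header[0].startswith("#"):
        problems.append("first column name must begin with '#'")
    for name in header:
        if " " in name:
            problems.append(f"column name contains a space: {name!r}")
        if name.lstrip("#").lower() in {"state", "type"}:
            problems.append(f"avoid the column title {name!r}")
    # Data rows start on line 2 of the file.
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        if len(fields) != len(header):
            problems.append(
                f"line {number}: expected {len(header)} columns, found {len(fields)}")
        if any(" " in field for field in fields):
            problems.append(f"line {number}: entry contains a space")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that the file will be accepted.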

Page 18

Metadata Files

• How to determine variable region vs. primer inclusion
• Directional awareness of primers
• If you aren’t sure, ask!
• These files are often called mapping, metadata, oligos, or linker-primer files (many other names are possible).

#name        barcode      proximal            distal             region  body_site
S_700033665  CCGTTCCTC    CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool
S_700035861  ACCGGCGTTC   CCGTCAATTCMTTTRAGT  CTGCTGCCTCCCGTAGG  V3V5    Stool

Image credit: Allmetrics.net Sales Material
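Directional awareness usually comes down to knowing whether a primer is written as it appears in the read or as its reverse complement. Here is a small sketch that reverse-complements a primer, including the IUPAC ambiguity codes (such as M and R) that appear in the proximal primer above. The function name is illustrative, and which orientation Genboree expects in each column is not stated on the slide, so ask if you are unsure.

```python
# Complement table covering A, C, G, T and the IUPAC ambiguity codes.
_COMPLEMENT = str.maketrans(
    "ACGTMRWSYKVHDBN",
    "TGCAKYWSRMBDHVN",
)


def reverse_complement(primer):
    """Return the reverse complement of a primer sequence."""
    return primer.upper().translate(_COMPLEMENT)[::-1]
```

For the distal primer in the example table, `reverse_complement("CTGCTGCCTCCCGTAGG")` returns `"CCTACGGGAGGCAGCAG"`, the same primer read from the opposite strand.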

Page 19

Metadata Files

• Another example: Tutorial Set 2 Metadata
• What possible issues may arise with this metadata file?

sampleName  tag         proximal             distal              region  sample_period  type
Ferm_5      AGCTTCGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    5              Fermentation
Ferm_2      GCCATACATT  GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    2              Fermentation
Ferm_3      GCCAGCAAGT  GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    3              Fermentation
Ferm_4      CGTTAAGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    4              Fermentation
Ferm_1      CTAACAGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    1              Fermentation
Soil_1      ACGCAAAA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    1              Soil
Soil_2      CTAACTAA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    2              Soil
Soil_3      GCGACCTAGT  GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    3              Soil
Soil_4      AAGAATCA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    4              Soil
Soil_5      AGCGCAGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    5              Soil

Page 20

Metadata Files

• Another example
• What possible issues may arise with this metadata file?
• Change sampleName => #name (or any first entry beginning with #)
• Change tag => barcode
• Change type => sample_type (do not name columns ‘type’ or ‘state’)
• Demo: making and saving as tab-delimited

#name       barcode     proximal             distal              region  sample_period  sample_type
Ferm_5      AGCTTCGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    5              Fermentation
Ferm_2      GCCATACATT  GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    2              Fermentation
Ferm_3      GCCAGCAAGT  GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    3              Fermentation
Ferm_4      CGTTAAGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    4              Fermentation
Ferm_1      CTAACAGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    1              Fermentation
Soil_1      ACGCAAAA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    1              Soil
Soil_2      CTAACTAA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    2              Soil
Soil_3      GCGACCTAGT  GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    3              Soil
Soil_4      AAGAATCA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    4              Soil
Soil_5      AGCGCAGA    GAGTTTGATCNTGGCTCAG  CAGCMGCCGCNGTAANAC  V1V3    5              Soil
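The three header changes listed on this slide can be applied to the Tutorial 2 metadata without opening a spreadsheet. This is a sketch only: the rename table mirrors the slide, and `fix_header` is a hypothetical helper name. Only the first line is touched, so the sample rows stay exactly as they were.

```python
# Header renames taken from the slide.
RENAMES = {"sampleName": "#name", "tag": "barcode", "type": "sample_type"}


def fix_header(text):
    """Rename the problem columns in the header line of tab-delimited text."""
    header, newline, body = text.partition("\n")
    # Preserve a Windows-style line ending if the file has one.
    carriage = "\r" if header.endswith("\r") else ""
    columns = [RENAMES.get(name, name)
               for name in header.rstrip("\r").split("\t")]
    return "\t".join(columns) + carriage + newline + body
```

Read the original file into a string, pass it through `fix_header`, and write the result to a new, clearly named file so the original stays available for comparison.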

Page 21

7zip

• Zipping methods and large file transfers
• Compression and archiving of files
• Uncompressing in an easy to use format for PCs
• Demo compressing
  – .sff(s)
  – http://www.7-zip.org/

From: 7-zip.org
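The demo on this slide uses the 7-Zip desktop program. As a scripted alternative for a single file, the sketch below uses Python's standard `gzip` module instead (a swapped-in technique, not 7-Zip itself) to produce the same kind of .sff.gz file as the Tutorial 1 data set, and to uncompress one again. Both function names are illustrative.

```python
import gzip
import shutil


def gzip_file(path):
    """Compress one file to '<path>.gz' and return the new file name."""
    compressed = f"{path}.gz"
    with open(path, "rb") as source, gzip.open(compressed, "wb") as target:
        shutil.copyfileobj(source, target)
    return compressed


def gunzip_file(path, destination):
    """Uncompress a .gz file into destination and return that name."""
    with gzip.open(path, "rb") as source, open(destination, "wb") as target:
        shutil.copyfileobj(source, target)
    return destination
```

The original file is left in place by `gzip_file`, so it can be deleted or kept once the upload has been confirmed.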

Page 22

Genboree Workflow

• Create Group
• Create Database
• Create Project
• Upload Files
• Create Samples (Sample Import using metadata file)
• Link Samples to Sequence Files (Sample File Linker)
• QC and Attach Sequences (Sequence Import)
• QIIME
• RDP