
Upload: percival-barker

Post on 18-Dec-2015


The Genome Consortium on Active Teaching Using Next-Generation Sequencing (GCAT-SEEK)

Genomics and bioinformatics are dynamic fields that provide opportunities to form student-scientist partnerships at small liberal arts colleges. Empowering undergraduate faculty with access to state-of-the-art technology and with tools to implement curricular changes is a difficult and evolving challenge. Over the last decade this challenge has been successfully addressed by the Genome Consortium on Active Teaching (GCAT), a grass-roots consortium of undergraduate educators. GCAT provided undergraduates with access to microarray technology and has reached over 300 faculty and 24,000 undergraduates. A major factor that enticed a diverse group of faculty to adjust their teaching strategies was the academic freedom of integrating their own research questions into an active teaching approach. A new network of educators (GCAT-SEEK) was formed in July 2011 to enable undergraduate access to next-generation sequencing and functional genomics using the GCAT organizational model. The consortium now involves over 100 faculty, postdocs, and students from over 80 institutions throughout the country. Major interest areas include genomics, transcriptomics, and metagenomics. GCAT-SEEK aims to engage students in inquiry-based learning that is grounded in the key concepts and competencies of modern biology and is connected to learning objectives and assessments. In our first year we identified three bottlenecks that make it difficult to integrate next-generation sequencing seamlessly into undergraduate courses and research experiences: experimental design for faculty members who are new to the technology, bioinformatics training of faculty, and identification of appropriate and effective pedagogical and assessment tools.

The vision of GCAT-SEEK is for faculty at primarily undergraduate institutions to direct their innate passion for research into projects of their choosing that become the cornerstone of innovative, broadly disseminated educational efforts that are assessed for student learning gains, and meet the goals of the “Vision and Change in Undergraduate Biology Education” dialogues published by AAAS and NSF.

Anticipated Broad Impacts: This network will provide additional educational opportunities and resources for STEM education and improved opportunity for students to be prepared for graduate, technical and research careers. With 116 faculty members from 88 institutions already members of the GCAT-SEEK network, we anticipate impacting thousands of students via this project, with special focus on minority representation.

Intellectual Merits of Network

• Community of enthusiastic biologists, with primary undergraduate teaching responsibilities
• Intellectual synergies on experimental design, bioinformatics approach, pedagogy and assessment
• Discounted runs, software
• Dissemination of data, pedagogic, assessment modules
• Outreach to Minority Serving Institutions
• Database of barcoded metagenomic primers
• Voice for student input: leadership training, presentations, participation
• Cross-disciplinary interactions
• Student Impact in Year 1: 28 research students, 95 students in labs

Standard Operating Procedure

As a result of faculty/student workshops, participants will be able to:

1. Design experiments using next-generation sequencing technologies
2. Prepare nucleic acid samples and assess quality
3. Sequence and analyze their samples
4. Teach modules that integrate next-generation sequencing research into the classroom, and
5. Assess student learning goals and track outcomes

Day    Setting    Theme                              Content
1 PM   Group      NextGen Experimental Design        Platforms, Experimental Design
2      Breakout   Wet Lab                            Sample Prep
3      Breakout   Bioinformatics                     Assembly
4 AM   Breakout   Bioinformatics                     Annotation / Comparison
4 PM   Group      Assessing Student Learning Gains   Customizing and Using the SALG
5 AM   Group      Faculty Presentations              Faculty teaching modules
5 PM   Group      Student Presentations              Student presentations

Proposed GCAT-SEEK workshop schedule and general content.

Vince Buonaccorsi, Juniata College
Jeff Newman, Lycoming College
Nancy Trun, Duquesne University
Tammy Tobin, Susquehanna University
Deborah Grove, Penn State University

Abstract

Technology Expertise

[Figure: three histograms of member survey responses (y-axis: Frequency). Panels: Linux Proficiency (scale 1-5, Low to High); Perl or Python Proficiency (scale 1-5, Low to High); Number of NextGen Data Sets Analyzed (bins 0, 1-5, 6-10, 11-15, 16-20, 21-25).]

• Relatively novice with respect to computer science or NextGen approaches

Teaching Experience

[Figure: three histograms of member survey responses (y-axis: Frequency). Panels: Undergraduate Teaching Experience (x-axis: Years Teaching, bins 0, 1-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-40); Years in GCAT (bins 1-5, 6-10, 11-15); Familiarity with Assessment Literature (scale 1-5, Low to High).]

• Relatively experienced with respect to teaching

Who is GCAT-SEEK?

MSI Institutions: MSI 14%; Non-MSI 86%

NextGen Apps of interest: Eukaryotic Genomics 26%; Bacterial Genomics 18%; Metagenomics 20%; Transcriptomics 36%

Field of Teacher/Scholars: Biochem / Mol Bio / Genetics 78%; Bioinformatics 5%; Evolution / Ecology 17%

Kingdoms of interest: Animalia 41%; Archaea 2%; Bacteria 16%; Fungi 13%; Plantae 28%

[Figure: histogram, Number of undergraduates at school (x-axis: Number of Students, bins 1-1000, 1001-5000, 5001-10000, 10001-20000, 20001-30000; y-axis: Frequency).]

• 14% from Minority Serving Institutions
• Diverse organisms and applications of interest.
• Predominantly BMB/Genetics/Microbiology faculty from small PUIs

Works Cited

• Vision and Change in Undergraduate Biology Education: Preliminary Reports of Conversations. July 2009. NSF-AAAS. www.visionandchange.org

Acknowledgements

• NSF Award # DBI-1061893
• HHMI award to Juniata College
• Juniata College: Kresge Fund, Biology Dept, Provost

Examples of Student / Scientist Partnerships in Year 1

Large non-model Eukaryotic genomics

[Flowchart; boxes listed in the order extracted from the poster. Role labels on the chart: Collaborators, Students.]

Sequence Genome
Assemble Genome
Create a custom BLAST database (Geneious) from the assembly
Download, study candidate gene sets (UniProt / GenBank / UCSC Genome Browser)
Identify contigs in novel genome with homology to candidate genomes (tBlastn in Geneious)
Literature Search
Formulate Specific Question
Identify full CDS in novel genome using the MAKER2 web annotation pipeline
Extract coding sequences using Galaxy / Apollo
Align sequences, separate into clusters, generate a phylogenetic tree (Geneious)
Calculate Ka/Ks ratio to determine positive selection (Selecton, Ka/Ks calculator)
Identify contigs in novel genome with homology to candidate genomes (tBlastn in Geneious)
Write MS

Large non-model Eukaryotic transcriptomics

Bacterial genomics: Lycoming College

[Flowchart; boxes listed in the order extracted from the poster. Role labels on the chart: Teacher, Students.]

Download Exome Trios from 1000 Genomes DB
Map against Human Ref using NextGENe on GCAT-SEEK server
Pick a single gene and research prognosis of individual (HUGO DB)
Present with two other lab mates that picked different SNPs from same individual:
• Prognosis
• Advice
Use NextGENe viewer to examine data
Filter differences:
• Errors
• Mode of inheritance
• dbNSFP
• Allele frequencies

[Trio genotype diagram; column labels: Mom, Dad, Child; genotypes as extracted: A G x A A, A A, C G, A G, C G]

Human genomics: Putative Freshman Lab

Conclusions

• Our standard operating protocol should facilitate growth in membership, faculty expertise, and student training.
• Network members have diverse interests, low NextGen and bioinformatic experience, but high teaching experience.
• Year 1 examples of genomics work illustrate relative ease of projects involving bacteria, collaboration with research intensive universities, and commercially supported software for novice users.

A student’s comparative analysis of transcriptome assembly methods. Geneious outperformed other methods in a 454 FLX+ low coverage (3X) dataset.
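The "3X" figure above is an average sequencing depth. As a minimal sketch of how such a number is usually estimated (total sequenced bases divided by the size of the target), the function below uses hypothetical read counts and a hypothetical target size; none of these values come from the student's dataset.

```python
def mean_coverage(read_lengths, target_size_bp):
    """Average depth = total sequenced bases / size of the target in bp."""
    if target_size_bp <= 0:
        raise ValueError("target_size_bp must be positive")
    return sum(read_lengths) / target_size_bp


# Hypothetical example: 6,000 reads of 500 bp against a 1 Mb target.
example_reads = [500] * 6000
example_depth = mean_coverage(example_reads, 1_000_000)
print(f"Estimated mean coverage: {example_depth:.1f}X")
```

For a transcriptome the effective target size depends on which transcripts are expressed, so a figure computed this way is only a rough guide.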

Pipeline successfully used by three students to explore targeted gene sets in the un-annotated Sebastes rockfish genome related to mate recognition and high speciation rates.
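The homology step of this pipeline was run with tBlastn inside Geneious. For readers working at the command line instead, the sketch below assumes the hits have been exported in the standard 12-column BLAST+ tabular format and collects, for each candidate protein, the contigs it matches below an E-value cutoff. The sequence names, the cutoff, and the function name `contigs_with_hits` are illustrative assumptions.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def contigs_with_hits(tabular_text, max_evalue=1e-10):
    """Map each query protein to the set of contigs it hits at or below max_evalue."""
    hits = {}
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=BLAST6_COLUMNS, delimiter="\t"
    )
    for row in reader:
        if float(row["evalue"]) <= max_evalue:
            hits.setdefault(row["qseqid"], set()).add(row["sseqid"])
    return hits


# Hypothetical tblastn hits: protein queries against assembled contigs.
example_rows = [
    ["receptorA", "contig_12", "62.5", "310", "110", "3", "5", "314", "10450", "11379", "2e-98", "301"],
    ["receptorA", "contig_40", "41.0", "120", "68", "2", "20", "139", "880", "1239", "1e-20", "95.1"],
    ["receptorB", "contig_7", "28.3", "60", "40", "1", "1", "60", "300", "479", "0.5", "30.2"],
]
example_text = "\n".join("\t".join(fields) for fields in example_rows)
example_hits = contigs_with_hits(example_text)
for query, contigs in sorted(example_hits.items()):
    print(query, sorted(contigs))
```

The weak `receptorB` hit is dropped by the cutoff, so only contigs with strong similarity move on to annotation.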

A student’s phylogenetic comparison of six uncharacterized pheromone receptors in Sebastes rubrivinctus (Sru) to three previously sequenced fishes. Further analyses showed no evidence of positive selection, which may occur in genes important to rapid speciation rates in the genus.
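The test for positive selection mentioned here rests on the Ka/Ks ratio. The students used Selecton and a Ka/Ks calculator; the sketch below is not that software. It assumes the counts of synonymous and nonsynonymous differences and sites are already known (for example from a Nei-Gojobori style count) and applies a Jukes-Cantor correction to each proportion. All numbers are hypothetical.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - (4.0 / 3.0) * p)


def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (Ka, Ks, Ka/Ks) from difference and site counts."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    if ks == 0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka, ks, ka / ks


# Hypothetical counts for one pair of aligned coding sequences.
ka, ks, ratio = ka_ks(nonsyn_diffs=6, nonsyn_sites=600, syn_diffs=15, syn_sites=200)
print(f"Ka={ka:.4f} Ks={ks:.4f} Ka/Ks={ratio:.3f}")
```

A ratio above 1 is usually read as evidence of positive selection, a ratio near 1 as neutral evolution, and a ratio below 1 as purifying selection.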

Isolate RNA / Sequence Transcriptome

Assemble Transcriptome using Geneious, CLC Bio, NextGENe

Student 1

Pipeline successfully used by students to annotate bacterial genomes

Venn Diagrams allowing correlation of metabolism and bacterial ecology
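The regions of a three-way Venn diagram like these can be computed with plain set operations once each genome has a list of annotated functions. This is a minimal sketch; the isolate names and function labels are hypothetical and are not taken from the Lycoming College genomes.

```python
def venn_regions(set_a, set_b, set_c):
    """Split three collections into the seven regions of a Venn diagram."""
    a, b, c = set(set_a), set(set_b), set(set_c)
    return {
        "all_three": a & b & c,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
    }


# Hypothetical annotated functions for three isolates from different habitats.
soil_isolate = {"nitrate reduction", "chitin degradation", "flagellar motility"}
freshwater_isolate = {"nitrate reduction", "flagellar motility", "phototrophy"}
gut_isolate = {"nitrate reduction", "bile salt hydrolysis"}

example_regions = venn_regions(soil_isolate, freshwater_isolate, gut_isolate)
for region, functions in example_regions.items():
    print(region, sorted(functions))
```

Functions that fall in a single-genome region are the natural starting point for linking metabolism to the habitat that isolate came from.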

Putative pipeline to find and interpret differences between an individual and human reference genome.

Example of a screenshot and scenario of compound heterozygosity
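Compound heterozygosity means the child carries two different variants in the same gene, one inherited from each parent. The sketch below shows one simplified way to flag such genes in trio data. It assumes biallelic sites, genotypes written as two-letter strings, and no phasing information. The `TrioVariant` record, gene names, and positions are hypothetical and do not reproduce the poster's example or the NextGENe workflow.

```python
from collections import defaultdict
from typing import NamedTuple


class TrioVariant(NamedTuple):
    gene: str
    position: int
    ref: str
    alt: str
    mom: str    # genotype as two alleles, e.g. "AG"
    dad: str
    child: str


def is_het(genotype, ref, alt):
    """True if the genotype carries exactly one ref and one alt allele."""
    return sorted(genotype) == sorted(ref + alt)


def compound_het_genes(variants):
    """Return genes where the child has at least one het variant from each parent."""
    by_gene = defaultdict(lambda: {"maternal": [], "paternal": []})
    for v in variants:
        if not is_het(v.child, v.ref, v.alt):
            continue
        mom_has, dad_has = v.alt in v.mom, v.alt in v.dad
        if mom_has and not dad_has:
            by_gene[v.gene]["maternal"].append(v.position)
        elif dad_has and not mom_has:
            by_gene[v.gene]["paternal"].append(v.position)
    return {
        gene: origins
        for gene, origins in by_gene.items()
        if origins["maternal"] and origins["paternal"]
    }


# Hypothetical trio calls for two genes.
example_variants = [
    TrioVariant("GENE1", 100, "A", "G", mom="AG", dad="AA", child="AG"),
    TrioVariant("GENE1", 250, "C", "T", mom="CC", dad="CT", child="CT"),
    TrioVariant("GENE2", 40, "G", "A", mom="GA", dad="GG", child="GA"),
]
example_result = compound_het_genes(example_variants)
print(example_result)
```

Variants carried by both parents are skipped here because their origin in the child cannot be assigned without phasing.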

Sample prep and de novo transcriptome assembly pipeline used by a student

A student has successfully installed the Linux-based MAKER pipeline on the GCAT-SEEK server, which can be used by other network members, allowing whole genome annotations. The MAKER web annotation service can be used by novice students to learn the analysis.

Annotation of a single scaffold in S. rubrivinctus focused on the TERF1 gene. Polymorphisms in this gene may help explain negligible senescence in Sebastes rockfishes.