friedberg lab-overview-grad-students


Upload: iddo

Post on 15-Apr-2017


http://iddo-friedberg.net Twitter: @iddux

The Friedberg Lab

Bacterial genome evolution

Protein function prediction

Metagenomics

Phenomics


About me

● 2003: PhD, Hebrew University, Jerusalem

● 2003-2006: Postdoc, Burnham Institute, CA

● 2006-2009: Researcher, UC San Diego

● 2009-2015: Assistant Professor, Miami University, Ohio

● 2015-: Associate Professor, Iowa State University


Friedberg Lab Members

Naihui Zhou

Nafiz Hamid

Xiao Hu

Ataur Katebi

Huy Nguyen


Lab philosophy

● Ask biological questions that have computational answers

● You may answer something else. That's great.

● There's treasure everywhere. But no data dredging


Starting question: how do complex structures evolve?


What is an operon?

● Operons are (almost) unique to prokaryotes.

[Figure: operon diagram with labels Regulation, Gene 1, Gene 2, Gene 3, Transcription, polycistronic mRNA, Translation]


Gene Blocks

[Figure: gene block diagram with labels Gene 1, Gene 2, Gene 3, Transcription, mRNA transcripts, Translation]

● A gene block is any suspected syntenic group of open reading frames (ORFs) in which neighboring ORFs are separated by no more than a maximum allowed spacing. For my research this maximum is 500 nt.
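The spacing rule can be sketched in a few lines of Python. This is a minimal illustration, not the lab's pipeline: the `ORF` record, the `group_into_blocks` function, and the example coordinates are all hypothetical, and the sketch ignores strand and homology, looking only at the gap between consecutive ORFs.

```python
from typing import List, NamedTuple


class ORF(NamedTuple):
    """Hypothetical ORF record: a name plus start/end coordinates in nt."""
    name: str
    start: int
    end: int


def group_into_blocks(orfs: List[ORF], max_gap: int = 500) -> List[List[ORF]]:
    """Group ORFs into candidate blocks; a gap above max_gap starts a new block."""
    blocks: List[List[ORF]] = []
    current: List[ORF] = []
    last_end = 0
    for orf in sorted(orfs, key=lambda o: o.start):
        # Close the current block when the spacing limit is exceeded.
        if current and orf.start - last_end > max_gap:
            blocks.append(current)
            current = []
        current.append(orf)
        # Track the furthest end seen in this block (handles overlapping ORFs).
        last_end = orf.end if len(current) == 1 else max(last_end, orf.end)
    if current:
        blocks.append(current)
    return blocks


# Made-up coordinates: the gaps are 50 nt, 600 nt and 100 nt.
example_orfs = [
    ORF("g1", 0, 900),
    ORF("g2", 950, 1800),
    ORF("g3", 2400, 3000),
    ORF("g4", 3100, 3500),
]
example_blocks = [[o.name for o in block] for block in group_into_blocks(example_orfs)]
```

With the 500 nt limit, the 600 nt gap between `g2` and `g3` separates the example into two candidate blocks.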


Background

● Operons are an important feature in prokaryotic genetics.

  – Often contain full metabolic pathways.

    ● A metabolic pathway is a set of chemical processes transforming one compound into another.

  – Regulate groups of genes.

  – Allow for the frequent transfer of gene blocks between organisms.

● Therefore, studying operon evolution helps us to understand metabolic pathway formation.


How we model changes in gene blocks

● We borrow ideas from sequence evolution, but genes are the atom of change.

  – Changes are called events.

  – There are more possible events modeling gene block evolution than in biological sequence evolution.

5' ATCCGA 3'

ATCCGT ATC-GA


Events investigated

● Deletions

● Duplications

● Splits

Gaps exceeding 100 kbp are common
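One way to picture the three event types is to count them between a reference gene block and a target genome's version of it. The sketch below rests on assumptions that are not stated on the slides: a deletion is a reference gene with no copy in the target, a duplication is each extra copy of a reference gene, and a split is each break between target segments separated by more than the allowed spacing. The function name and the example data are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def count_events(reference: List[str], target_segments: List[List[str]]) -> Dict[str, int]:
    """Count deletions, duplications and splits of `reference` in a target.

    `target_segments` lists the target's genes, already cut into segments
    wherever neighboring genes lie further apart than the allowed spacing.
    """
    ref_genes = set(reference)
    # How many copies of each reference gene appear anywhere in the target.
    copies = Counter(
        gene for segment in target_segments for gene in segment if gene in ref_genes
    )
    deletions = sum(1 for gene in ref_genes if copies[gene] == 0)
    duplications = sum(n - 1 for n in copies.values() if n > 1)
    # Only segments that carry at least one reference gene count toward splits.
    occupied = [s for s in target_segments if any(g in ref_genes for g in s)]
    splits = max(len(occupied) - 1, 0)
    return {"deletions": deletions, "duplications": duplications, "splits": splits}


# Made-up example: gene C is missing, gene D appears twice, and the block
# is broken into two segments.
example_events = count_events(
    ["A", "B", "C", "D", "E"],
    [["A", "B"], ["D", "D", "E"]],
)
```

Here `example_events` records one event of each type.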



Results: Normalized interspecies event rate

We asked: how do complex structures evolve?

We answered: the relationship between conservation and function.


Ancestry of Orthoblocks

[Figure: five numbered genomes (1-5), each shown with a gene block drawn from genes A, B, C, D, E]


Species tree shows ancestry

[Figure: species tree over genomes 1-5, each shown with a gene block drawn from genes A, B, C, D, E]


Ancestral resolution

[Figure: genomes 1-5 and their gene blocks (genes A, B, C, D, E), with further gene blocks added to the tree]


Science Olympics: Timed Challenges


Why CAFA?

“On the one hand, we have enormous ‘protein’ databases that are replete with errors, wishful thinking, phantoms, and uncertainties. On the other, we have a tiny fraction of real proteins that have been studied in any depth.” -- Dan Graur

Biggest problem in molecular biology: a genome costs < $1,000, BUT its annotation costs $20,000 to > $10,000,000.


CAFA

● The Critical Assessment of Function Annotation

● Hundreds of scientists trying to predict protein function from sequence

● A friendly competition between scientific teams

The Protein function prediction problem

>sp|P04637|P53_HUMAN
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELPPGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD

[Figure: ontology terms under Biological process (Cell differentiation, Apoptosis), compared in two panels labeled PREDICTED and TRUE]

MEASURING FUNCTIONAL SIMILARITY

Precision: Recall:
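The formulas on this slide did not survive in the transcript. As a rough sketch of the usual set-based definitions (not necessarily the exact form shown on the slide), precision is the fraction of predicted terms that are true, and recall is the fraction of true terms that were predicted. The function name is hypothetical, and which terms count as predicted or true in the example is an illustrative assumption.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], true: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one protein's predicted and true term sets."""
    overlap = len(predicted & true)
    # Guard against empty sets so the sketch never divides by zero.
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(true) if true else 0.0
    return precision, recall


# Illustrative assignment of the slide's terms to the two sets.
predicted_terms = {"biological process", "apoptosis"}
true_terms = {"biological process", "cell differentiation"}
example_precision, example_recall = precision_recall(predicted_terms, true_terms)
```

In this example the two sets share only the root term, so both precision and recall come out at one half.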


CAFA1 vs. CAFA2

                    CAFA1 (2011)    CAFA2 (2014)
Methods             54              129
Groups              29              57
Targets             50,000          108,000
Benchmarks          ~840            3,681
Target types        No knowledge    No knowledge, partial knowledge
Ontologies          MFO, BPO        MFO, BPO, CCO, HPO
Target set choice   Full mode       Full mode, partial (>5000) mode
Assessment          Fmax            Fmax, Smin
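Fmax, the assessment measure in the last row, is the best F-measure a method reaches over all score thresholds. The sketch below follows the protein-centric definition as I understand it from the CAFA publications: precision is averaged over proteins that have at least one prediction at the threshold, and recall over all benchmark proteins. The function name, the threshold grid, and the toy data are assumptions for illustration, not CAFA's evaluation code.

```python
from typing import Dict, Set


def fmax(scores: Dict[str, Dict[str, float]], truth: Dict[str, Set[str]], steps: int = 100) -> float:
    """Maximum F-measure over thresholds 1/steps, 2/steps, ..., 1.

    scores: protein -> {term: confidence in (0, 1]}
    truth:  protein -> non-empty set of true terms
    """
    best = 0.0
    for i in range(1, steps + 1):
        threshold = i / steps
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            predicted = {
                term for term, score in scores.get(protein, {}).items() if score >= threshold
            }
            hits = len(predicted & true_terms)
            # Precision only counts proteins with a prediction at this threshold.
            if predicted:
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy data: two proteins, two terms.
example_scores = {"p1": {"a": 0.9, "b": 0.4}, "p2": {"a": 0.8}}
example_truth = {"p1": {"a"}, "p2": {"b"}}
example_fmax = fmax(example_scores, example_truth)
```

On the toy data the best trade-off occurs at thresholds between 0.81 and 0.90, where only the correct prediction for `p1` remains: precision is 1, recall is 0.5, and the F-measure is 2/3.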


Molecular Function, no-knowledge, full set


Biological Process, no-knowledge, full set


Cellular Component, no-knowledge, full set

Cytosol: the “cellular default”


Human Phenotype Ontology

Many terms (>80) per protein. Some are easily predicted, others are not predicted at all.


“Is we learning?”

Molecular Function


“Is we learning?”

Biological Process


CAFA Team

Predrag Radivojac

Casey Greene

Sean Mooney

Maria Martin

Claire O'Donovan


Metagenomics


Bacteria naturally live...


...everywhere but...


...as communities


...metagenomic revolution

6,000,000 genes; >6,000,000,000 bp; ~900 ribotypes

3,000,000 genes; 160,000,000 bp; >1,100 ribotypes


What can we learn and how?


Babies are a Good Model for Humans (With Texas A&M)


Microbial community (metagenome)

Gut epithelial cells (transcriptome)

454 sequencing

Who? (Phylogenetic analysis)

What? (functional analysis)

Clean & RT

Codelink chip


Altered Schaedler Flora

• Bacteria originate from mice
  – Unique niche
    • Stable GI community
    • All cultivable in anaerobic chamber

• Advantages
  – Representative community
    • Animals are healthy/normal growth
  – Specific host immune responses
    • Antibodies
    • T-cell recall
  – Evaluate entire GI microbiota
    • Quantitative changes
    • Spatial redistribution
    • Community gene expression
    • Mutation identification

ASF

The Altered Schaedler Flora (ASF): Greg Phillips & Michael Wannemuehler

Wannemuehler, et al. 2014. Genome Announc. 2.

1 No Proteobacteria in ASF; * Specific Pathogen Free

ASF*1

Table 1: Genome sequencing results of the ASF community

ASF #   Taxonomy                     Genome size (Mb)  %GC    Gene count  Contig count  N50 (Kb)  Fold coverage  GenBank accession #
ASF356  Clostridium bacterium        2.91              30.91  2799        31            209       143            AQFQ00000000.1
ASF360  Lactobacillus intestinalis   2.01              35.86  1930        244           19        47             AQFR00000000.1
ASF361  Lactobacillus murinus        2.17              39.96  2102        78            59.7      160            AQFS00000000.1
ASF457  Mucispirillum schaedleri     2.33              31.15  2144        39            151       142            AYGZ00000000.1
ASF492  Eubacterium plexicaudatum    6.51              42.86  6217        248           74.4      119            AQFT00000000.1
ASF500  Pseudoflavonifractor sp.     3.70              58.77  3563        42            300       137            AYJP00000000.1
ASF502  Clostridium bacterium        6.48              47.90  6062        134           137       82             AQFU00000000.1
ASF519  Parabacteroides goldsteinii  6.87              43.45  5477        24            584       143            AQFV00000000.1
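For readers unfamiliar with the N50 column: N50 is the contig length at which contigs of that length or longer hold at least half of the assembled bases. A minimal sketch of the standard calculation follows. The function name and the contig lengths are made up for illustration and are not taken from the ASF assemblies.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


# Made-up contig lengths: total 32, so the half-way point is 16.
example_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

In the example the two contigs of length 8 already cover half of the 32 bases, so the N50 is 8.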

Mark Lyte's Lab: Microbial Endocrinology

Questions?