The use of the concepts of evolutionary biology in genome (biological) annotation. Pierre Pontarotti EA 3781 Evolution Biologique pontarot@up.univ-mrs.fr http://www.up.univ-mrs.fr/evol/

Upload: elisabeth-serre

Post on 03-Apr-2015


Page 1: The use of the concepts of evolutionary biology in genome (biological) annotation. Pierre Pontarotti EA 3781 Evolution Biologique pontarot@up.univ-mrs.fr

The use of the concepts of evolutionary biology in genome (biological) annotation.

Pierre Pontarotti
EA 3781 Evolution Biologique

pontarot@up.univ-mrs.fr
http://www.up.univ-mrs.fr/evol/


• Some concepts in evolutionary biology

• Use of the concepts for gene structural and functional annotation

• Computerization

• Other concepts


Metazoan phylogeny (Adoutte et al. 2000)

[Figure: phylogenetic tree of the Metazoa.
BILATERIA
  PROTOSTOMES
    ECDYSOZOANS: Arthropods, Gastrotrichs, Nematodes, Onychophorans, Tardigrades, Kinorhynchs, Priapulids
    LOPHOTROCHOZOANS: Molluscs, Rotifers, Annelids, Gnathostomulids, Sipunculans, Nemerteans, Pogonophorans, Platyhelminthes, Entoprocts, Bryozoans, Brachiopods, Phoronids
  DEUTEROSTOMES: Vertebrates, Cephalochordates, Urochordates, Hemichordates, Echinoderms
Outside the Bilateria: Ctenophorans, Cnidarians, Poriferans
Urbilateria (??) is marked on the tree.]


URBILATERIA: the hypothetical ancestor of the bilaterian metazoans (an idea going back to Geoffroy Saint-Hilaire, 19th century)

The URBILATERIA genome evolved by the fixation of:
• Nucleotide substitutions
• Gene losses
• Genic duplications: gene duplication, genome region duplication, whole genome duplication
• Chromosomal rearrangements


Large-scale gene duplication in the vertebrate lineage

[Figure: phylogeny of the Protostomata (Insects (Drosophila), Nematoda (C. elegans)) and the Deuterostomata (Echinodermata, Urochordata (Ciona), Cephalochordata (amphioxus), and the Vertebrates: Myxini (hagfish), Cephalaspidomorphi (lamprey), Chondrichthyes (shark), Actinopterygii (zebrafish), Lissamphibia, Amniota (human)). Node labels: 360, 450, 528, 564, 751, >751, 833-993, <833-993. Other labels: T1, T2; 20 000 genes; Pikaia.]


From alleles to orthologs: point mutations

[Figure: population POP 1 carries alleles A, B, C and D. POP 1 splits into two autonomous populations, POP 1A and POP 1B. Allele A is fixed in POP 1A and allele B in POP 1B; each accumulates new mutations, giving alleles A1, A2 and B1, B2.]


From alleles to orthologs: point mutations

[Figure: POP 1A (alleles A1, A2) splits into two autonomous populations, POP 1A1 and POP 1A2; POP 1B (alleles B1, B2) splits into POP 1B1 and POP 1B2. Alleles A1, A2, B1 and B2 are each fixed in one population and accumulate new mutations, giving A11, A12, A21, A22, B11, B12, B21 and B22.]


From alleles to orthologs

[Figure: A.1.1/A.1.2, A.2.1/A.2.2, B.1.1/B.1.2 and B.2.1/B.2.2 are four pairs labelled "Alleles"; the comparison across the pairs is labelled "Orthologs".]


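Orthologs and paralogs

[Figure: an ancestral gene A gives A1/2 and A3, then A1, A2 and A3 in URBILATERIA (duplication). After speciation, the HUMAN multigenic family is A1, A2, A3’, A3” and the DROSOPHILA multigenic family is A1, A2, A3. Labels: Duplication; Speciation; "A1, A2, B: paralogs".]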


Orthology / Paralogy

Orthologs: two genes in different species that derive from a common ancestral gene and were separated by a speciation event.

Paralogs: two genes resulting from a duplication event within a genome.
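The two definitions can be illustrated on a small gene tree. The sketch below is not taken from the talk: it assumes a hypothetical `Node` class whose internal nodes are already labelled "speciation" or "duplication", and it classifies two genes by the event found at their last common ancestor. The toy tree mirrors the A1/A2/A3 example, with `A3p` and `A3pp` standing in for A3’ and A3”.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Gene-tree node. Leaves carry a gene name; internal nodes carry an event."""
    event: Optional[str] = None   # "speciation" or "duplication" (internal nodes)
    name: Optional[str] = None    # gene name (leaves only)
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Collect the gene names found under a node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def relationship(root: Node, gene_a: str, gene_b: str) -> str:
    """Classify two distinct genes by the event at their last common ancestor."""
    node = root
    while True:
        # Descend while one child subtree still contains both genes.
        below = [c for c in node.children
                 if gene_a in leaves(c) and gene_b in leaves(c)]
        if not below:
            break
        node = below[0]
    return "orthologs" if node.event == "speciation" else "paralogs"


def leaf(name: str) -> Node:
    return Node(name=name)


# Hypothetical toy tree: old duplications, then the human/Drosophila
# speciation, then a human-specific duplication of A3.
a1 = Node("speciation", children=[leaf("A1_HUMAN"), leaf("A1_DROSO")])
a2 = Node("speciation", children=[leaf("A2_HUMAN"), leaf("A2_DROSO")])
a12 = Node("duplication", children=[a1, a2])
a3_human = Node("duplication", children=[leaf("A3p_HUMAN"), leaf("A3pp_HUMAN")])
a3 = Node("speciation", children=[a3_human, leaf("A3_DROSO")])
root = Node("duplication", children=[a12, a3])

print(relationship(root, "A1_HUMAN", "A1_DROSO"))     # orthologs
print(relationship(root, "A3p_HUMAN", "A3_DROSO"))    # orthologs (co-ortholog)
print(relationship(root, "A3p_HUMAN", "A3pp_HUMAN"))  # paralogs
```

Labelling the internal nodes is the hard part in practice; here the labels are simply given.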

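[Figure: gene tree. Gene A gives A1/2 and A3 by duplication; the leaves are A1 HUMAN, A1 DROSO, A2 HUMAN, A2 DROSO, A3’ HUMAN, A3” HUMAN and A3 DROSO. A3’ HUMAN and A3” HUMAN are labelled co-orthologs. Node labels: Duplication, Speciation.]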


From Gene History

To Gene Function


Orthologs under purifying selection

[Figure: gene A in URBILATERIA. After speciation, A remains under purifying selection in both DROSOPHILA and HUMAN and keeps the ancestral function in each.]


Ortholog functional switch

[Figure: gene A in URBILATERIA. After speciation, the DROSOPHILA ortholog A stays under purifying selection and keeps the ancestral function; the HUMAN ortholog A2 evolves under positive or relaxed selection: new function?]


Co-ortholog subfunctionalization

[Figure: gene A in URBILATERIA. After speciation, the DROSOPHILA gene A stays under purifying selection and keeps the ancestral function; in HUMAN, a duplication yields A’ and A”, each carrying a sub-function.]


Co-ortholog neofunctionalization

[Figure: gene A in URBILATERIA. After speciation, the DROSOPHILA gene A stays under purifying selection and keeps the ancestral function; in HUMAN, a duplication yields A (purifying selection, ancestral function) and A2 (positive or relaxed selection, new function).]


• Orthology/paralogy information is important for functional inference

• (This does not apply to species with a high level of horizontal gene transfer)



Many scientists use the best BLAST hit to infer orthologous relationships

(a warning that other speakers will also discuss)

… BUT!

• Many co-orthologs can be present
• It is a problem with genomes that are not fully sequenced, or when gene loss has occurred

AND

Even with phylogenetic analysis:
• Bias must be corrected.
• A phylogenetic tree is a hypothesis.
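The co-ortholog problem can be made concrete with a toy reciprocal-best-hit search. This is a minimal sketch, not the speaker's method: the gene names and similarity scores are invented for illustration (they stand in for BLAST bit scores), and `A3p`/`A3pp` denote two human co-orthologs of a single Drosophila gene.

```python
from typing import Dict, Set, Tuple

# Hypothetical scores between Drosophila (first) and human (second) proteins.
scores: Dict[Tuple[str, str], float] = {
    ("A3_DROSO", "A3p_HUMAN"): 310.0,
    ("A3_DROSO", "A3pp_HUMAN"): 295.0,
    ("A3_DROSO", "A1_HUMAN"): 120.0,
    ("A1_DROSO", "A1_HUMAN"): 400.0,
    ("A1_DROSO", "A3p_HUMAN"): 110.0,
    ("A1_DROSO", "A3pp_HUMAN"): 105.0,
}


def best_hits(scores: Dict[Tuple[str, str], float], reverse: bool = False) -> Dict[str, str]:
    """Map each query to its single best-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for (fly, human), score in scores.items():
        query, subject = (human, fly) if reverse else (fly, human)
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(scores: Dict[Tuple[str, str], float]) -> Set[Tuple[str, str]]:
    """Keep only the pairs that are each other's best hit."""
    forward = best_hits(scores)
    backward = best_hits(scores, reverse=True)
    return {(q, s) for q, s in forward.items() if backward.get(s) == q}


pairs = reciprocal_best_hits(scores)
print(sorted(pairs))
# A3pp_HUMAN is absent from the result although it is a co-ortholog of A3_DROSO.
```

Only one of the two human co-orthologs survives, which is the situation a tree-based analysis is meant to resolve.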


• An evolutionary shift (due to positive or relaxed selection) may be linked to a functional shift. See the talks by N. Galtier and A. Levasseur.
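One common way to describe the selection regime on a coding sequence is the ratio ω = dN/dS of nonsynonymous to synonymous substitution rates. The sketch below only shows how ω is usually read; it is an assumption that this measure is relevant here, since the slides do not say which test the cited talks use. The function name and the tolerance `tol` are hypothetical, and real analyses rely on statistical tests rather than a fixed cutoff.

```python
def selection_regime(dn: float, ds: float, tol: float = 0.1) -> str:
    """Read omega = dN/dS: below 1 purifying, near 1 neutral/relaxed, above 1 positive."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if omega > 1 + tol:
        return "positive selection"
    if omega < 1 - tol:
        return "purifying selection"
    return "neutral or relaxed constraint"


print(selection_regime(0.02, 0.20))  # purifying selection
print(selection_regime(0.30, 0.20))  # positive selection
```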


• Detection of Positive selection and functional shift


• Detection of Evolutionary constraint relaxation and functional shift


Co-ortholog neofunctionalization

[Figure: gene A in URBILATERIA. After speciation, the DROSOPHILA gene A stays under purifying selection and keeps the ancestral function; in HUMAN, a duplication yields A (purifying selection, ancestral function) and A2 (new function).]


Constitutive proteasome β-subunit replacement after interferon-γ stimulation

Paralogue = duplicated gene

Paralogue replacement (constitutive proteasome → immunoproteasome):
PSMB5 → PSMB8 (LMP 7)
PSMB6 → PSMB9 (LMP 2)
PSMB7 → PSMB10 (LMP Z)

Immunoproteasome:
• New function (specialization): degradation into proteins or peptides of a specific size, used by the MHC system
• Only found in vertebrates

Constitutive proteasome:
• Ancestral function: protein degradation
• Present in all metazoans, therefore present in Urbilateria (metazoan ancestor).


Large-scale gene duplication in the vertebrate lineage

[Figure: the phylogeny of the Protostomata (Insects (Drosophila), Nematoda (C. elegans)) and the Deuterostomata (Echinodermata, Urochordata (Ciona), Cephalochordata (amphioxus), Myxini (hagfish), Cephalaspidomorphi (lamprey), Chondrichthyes (shark), Actinopterygii (zebrafish), Lissamphibia, Amniota (human)), with node labels 360, 450, 528, 564, 751, >751, 833-993, <833-993. The labels "Proteasome" and "Immuno-Proteasome" mark lineages on the tree. Caption: PROTEASOME.]


[Figure: phylogenetic tree of PSMB7 and PSMB10. Leaves: PSMB7 (Mus, Ratt, Bos, Homo, Gall, Xeno, Zebra, Fugu); PSMB10 (Zebra, Fugu, Bos, Mus, Homo); PSMB7/10 (Bran, Ci-zeta Ciona, Bombyx, Prosbeta2, CG18341 Drosophila). Node support values are shown; asterisks mark nodes, one of them labelled "Duplication". Scale bar: 0.1.]


The study of gene and genome HISTORY

helps to find evidence for gene FUNCTION.


• Concepts in evolutionary biology

• Use of the concepts for structural and functional annotation:

Structural annotation (deciphering gene structure).
Functional annotation (especially the use of phylogeny to decipher protein function).


Biochemical and biological process:

• Experimental approach: RNA interference; tandem affinity purification and mass spectrometry

• In silico

Functional annotation


• Functional annotation

Based on phylogeny, starting from experimentally annotated genes…


INTERLUDE

• FUNCTION????

• A complex concept;


Function prediction:

• Using orthology information (done)

• Using evolutionary shift information (in progress)

• By integrative phylogenomics (Engelhardt et al., PLoS Computational Biology 2005) (in progress)


Homologs with experimentally known function: how information can be found.

Gene Ontology

MedLine

SwissProt

Textual Information Analysis

G.O. Standard

GenBank

Functional annotation


• Biological process – biological process to which the gene or gene product contributes. Cell growth and maintenance; pyrimidine metabolism; …

• Molecular function – biochemical activity, including specific binding to ligands or structures, of a gene product. Enzyme, transporter; Toll receptor ligand, …

• Cellular component – place in the cell where a gene product is active. Cytoplasm, ribosome, …

• Plus other classifications to be developed, in particular an evolution-based ontology

Functional annotation

Gene Ontology Classification


Only a small fraction corresponds to known, well-characterized proteins.

If the function is unknown: phylogenetic analysis.

Functional prediction:

• Using orthology information

• Using evolutionary shift information

• By integrative phylogenomics
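Prediction from orthology information can be sketched as a transfer of annotations: terms attached to experimentally annotated orthologs are proposed for the unannotated gene. The code below is an illustration under stated assumptions, not the platform's implementation; the gene names, the annotation strings and the `min_support` threshold are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def transfer_function(orthologs: List[str],
                      annotations: Dict[str, Set[str]],
                      min_support: float = 1.0) -> Set[str]:
    """Propose terms shared by at least `min_support` of the annotated orthologs."""
    annotated = [annotations[g] for g in orthologs if g in annotations]
    if not annotated:
        return set()  # nothing experimentally known: no prediction
    counts = Counter(term for terms in annotated for term in terms)
    return {term for term, n in counts.items() if n / len(annotated) >= min_support}


# Hypothetical experimentally derived annotations.
annotations = {
    "geneX_MOUSE": {"protein degradation"},
    "geneX_DROSO": {"protein degradation", "peptide processing"},
}
orthologs = ["geneX_MOUSE", "geneX_DROSO", "geneX_FUGU"]  # geneX_FUGU is unannotated

print(transfer_function(orthologs, annotations))  # {'protein degradation'}
```

Requiring agreement among all annotated orthologs (`min_support=1.0`) is the conservative choice; lowering it trades precision for coverage.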


Tumor necrosis factor family phylogenetic tree: ortholog identification

[Figure: phylogenetic tree of TNF superfamily ligands (TNFSF1, 2, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15) from several species (prefixes Hsa, Mmu, Rno, Gga, Xla, Dre, Pol, Bbo), together with Drosophila EIGER (DmeTNF). Node support values are shown. Other labels: DF1, DF2, DF3; atherosclerotic plaque formation; ALPS - LPR/GLD lymphoproliferative syndrome. Scale bar: 0.2. Source: Trends in Immunology (July 2003).]


Functional annotation

Human TNF family phylogenetic tree: search for the closest paralog

[Figure: phylogenetic tree of the human TNF ligands (TNFSF1, TNFSF2, TNFSF3, TNFSF14, TNFSF6, TNFSF10, TNFSF11, TNFSF5, TNFSF13B, TNFSF13, TNFSF12?, TNFSF9, TNFSF8, TNFSF7, TNFSF18, TNFSF4, EDA-A1, EDA-A2, TNFSF15) with annotation columns headed "Molecular Function" and "Biological Process".
Receptors shown: TNFRSF1A, TNFRSF1B, TNFRSF3, TNFRSF14, TNFRSF6B, TNFRSF11A, TNFRSF5, TNFRSF11B, TNFRSF17, TNFRSF9, TNFRSF8, TNFRSF6, TNFRSF10A, TNFRSF10B, TNFRSF10C, TNFRSF10D, TACI, TNFRSF7, TNFRSF18, TNFRSF4, TNFRSF19, EDAR, XEDAR, TNFRSF21, RELT, TNFRSF12, BR3.
Biological processes listed include: LN, PP, GC; tumoricidal activity; T cell homeostasis (death or survival); CTL function and activation; peripheral tolerance; T cell costimulation, activation and priming; chemotaxis; bone homeostasis; mammary gland development; tooth, hair, skin and sweat gland formation; B cell function and homeostasis; negative selection; autoimmunity. Entries followed by "?" are uncertain.
Source: Trends in Immunology (July 2003).]


Only a small fraction corresponds to known, well-characterized proteins.

If the function is unknown: phylogenetic analysis.

Gene function prediction:

• Using orthology information
• Using evolutionary shift information (see the Levasseur talk)
• By integrative phylogenomics


Evolutionary biology concepts for genome annotation

Further reading

Concepts

Levasseur A, Danchin E, Orlando L, Bailly X, Pontarotti P. Conceptual bases for quantifying the role of the environment on genomes evolution: the participation of positive selection and neutral evolution. Biological Reviews, in press.

Danchin E.G.J, et al. The Major Histocompatibility Complex origin. Immunological Reviews. 2004;198(1):216-232.

Concepts for applied evolution

Danchin E.G.J, Levasseur A, Lopez-Rascol V, Gouret P, Pontarotti P. The use of evolutionary biology concepts for genome annotation. J. Exp. Zoology Part B: Mol. and Dev. Evol. 2006.


Computerization of concepts and knowledge

• Phylogeny

• Detection of orthologous and paralogous genes

• Detection of evolutionary shifts (in progress)

• Function prediction


FIGENIX is a multi-user software platform dedicated to structural and functional annotation tasks:

- Gene prediction for large DNA sequences

- Construction of robust phylogenetic trees

- Automatic detection of orthologs and paralogs

- Automatic retrieval of functional data on genes from Web databases

- Filtering and construction of protein databases (EST contig assembly)

- Chained processes (e.g. gene prediction followed by a phylogenetic study of each predicted gene)


STEPS OF THE phylogeny PIPELINE (1)

[Flowchart, labels translated from French:]
• Input: protein sequence encoded by a putative gene
• BLAST + filtering (databases: Ensembl, NR…)
• CLUSTAL W + purification + bias correction, giving a multiple alignment
• Domain search with HmmPFAM (PFAM); creation of « FIGENIX » domains (correctDomains); domain enumeration
• Do « repeats » exist? (yes/no): keep the complete alignment, or keep the monophyletic « repeats » and align the merged « repeats »
• Composition test with TREE-PUZZLE to eliminate overly divergent sequences
• Construction of the Tree of Life, giving the reference tree


STEPS OF THE phylogeny PIPELINE (2)

[Flowchart, labels translated from French:]
• Detection of « paralogy groups » + elimination of sites that evolve too fast (« Gu test »)
• Elimination of sequences with >30% « gaps »
• Elimination of the most incongruent domains, detected with HomPart in PAUP
• Saturation test
• Tree building with three methods, one tree each: NJ, parsimony, maximum likelihood
• Comparison of the topologies with Templeton-Hasegawa tests
• Congruent topologies? (yes/no): NJ tree or consensus tree
• Construction of the Tree of Life, giving the reference tree
• Detection of orthologs; search for functions
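One of the filtering steps named on this slide, the elimination of sequences with more than 30% gaps, is simple enough to sketch. This is an illustration only, not FIGENIX code: the function name, the dictionary representation of the alignment and the toy sequences are assumptions.

```python
from typing import Dict


def drop_gappy_sequences(alignment: Dict[str, str],
                         max_gap_fraction: float = 0.30) -> Dict[str, str]:
    """Keep aligned sequences whose fraction of '-' does not exceed the threshold."""
    kept = {}
    for name, seq in alignment.items():
        if not seq:
            continue  # skip empty rows rather than divide by zero
        if seq.count("-") / len(seq) <= max_gap_fraction:
            kept[name] = seq
    return kept


# Hypothetical aligned sequences (all of length 10).
alignment = {
    "seq1": "MKT-LLV-AA",  # 20% gaps: kept
    "seq2": "M---L---AA",  # 60% gaps: removed
    "seq3": "MKTALLVAAA",  # no gaps: kept
}
print(list(drop_gappy_sequences(alignment)))  # ['seq1', 'seq3']
```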


FIGENIX architecture

[Diagram components: Web Server; Annotation Engine; Expert System; Genomic Data; Persistence Layer; RDBMS; Repository (load balancing, security, ...); Archiver; Functional Collector Agent with MGIAgent, GOAgent and ESTAgent. Arrow labels: Request; Data exchange.]

- Intranet/Extranet platform

- 3-tier architecture (web interface / business-logic servers / database)


Further reading: about the computerization of concepts

• Gouret et al. FIGENIX: intelligent automation of genomic annotation: expertise integration in a new software platform. BMC Bioinformatics 2005, 6:198.

• Balandraud et al. A rigorous method for multigenic families' functional annotation: the peptidyl arginine deiminase (PADs) proteins family example. BMC Genomics 2005, 6:153.


Further reading on the use of FIGENIX

• Danchin et al. Eleven ancestral gene families lost in mammals and vertebrates while otherwise universally conserved in animals. BMC Evolutionary Biology 2006, 6:5.

• Paillisson et al. Bromodomain testis-specific protein is expressed in mouse oocyte and evolves faster than its ubiquitously expressed paralogs BRD2, -3 and -4. Genomics. 2006.

• Levasseur et al. Tracking the evolutionary and functional shifts connection: the lipase-esterase example. BMC Evolutionary Biology 2006.


Structural annotation


Structural annotation

Genome nucleotide-level annotation:

• Mapping
• Finding genomic landmarks
• Gene finding and protein prediction
• Non-coding RNAs and regulatory regions
• Identifying repetitive elements
• Mapping segmental duplications
• Mapping variations (SNPs, microsatellites, …)


Structural annotation: state of the art

Available tools

Ab initio:
• Genscan
• Fgenesh
• Genie
• Etc.
Based on statistical signals within the DNA: coding propensity (hexamer signals) and splice site signals.
Strengths: easy and quick to run; only need DNA as input.
Weakness: high false positive rate.

Similarity based:
• Genewise
• Sim4
• Est2genome
• Figenix
Alignment programs that know about gene structure. Very accurate with strong sequence similarities.
Strengths: accurate.
Weakness: need strong similarities; slow to run.


• Structural annotation: the « FIGENIX SOFTWARE PLATFORM » annotation method

The method combines a statistical approach with a homology-based approach (similarities with known proteins). Automating the process resulted in an expert system based on biological inference rules that use gene history and ab initio programs. It is not yet completely based on evolutionary biology.


[Figure: a DNA segment with two regions. Protein A is the best hit for region 1, with HSPs A1, A2 and A3; protein B is the best hit for region 2, with HSPs B1 and B2.]



Validation of structural annotation

Gene = nucleotide sequence
  (transcription)
mRNA = nucleotide sequence
  (translation)
Protein = amino acid sequence

Protein sequence:
Figenix: 87%
HMMGene: 38%
Genscan: 31%

The platform's performance was validated on a standard dataset (HMR195); see Guigó et al., 2000; Rogic et al., 2001.


Structural annotation

Accuracy versus exon type and prediction

                    EXON TYPE
PROGRAM    Initial (55)   Internal (186)   Terminal (55)   OVER PREDICTION   CORRECT PROTEIN PREDICTION
Genscan    0.55           0.80             0.65            0.22              0.31
Figenix    0.91           0.92             0.95            0.05              0.87
Hmmgene    0.75           0.81             0.78            0.15              0.38

The mouse and rat sequences from the HMR195 dataset were run against the human division of Swiss-Prot.
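The slide does not define how its accuracy and over-prediction figures were computed. As a hedged illustration of exon-level scoring, the sketch below assumes two common definitions: an exon counts as correct only when both of its boundaries match an annotated exon, and a predicted exon counts as over-predicted when it overlaps no annotated exon. The function name and the coordinates are hypothetical.

```python
from typing import Set, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive coordinates


def exon_accuracy(actual: Set[Exon], predicted: Set[Exon]) -> Tuple[float, float]:
    """Return (fraction of actual exons predicted exactly, fraction of predictions overlapping nothing)."""
    def overlaps(a: Exon, b: Exon) -> bool:
        return a[0] <= b[1] and b[0] <= a[1]

    correct = actual & predicted
    spurious = {p for p in predicted if not any(overlaps(p, a) for a in actual)}
    sensitivity = len(correct) / len(actual) if actual else 0.0
    over_prediction = len(spurious) / len(predicted) if predicted else 0.0
    return sensitivity, over_prediction


# Hypothetical annotated and predicted exons for one gene.
actual = {(100, 200), (300, 400), (500, 650)}
predicted = {(100, 200), (300, 410), (500, 650), (800, 900)}

sens, over = exon_accuracy(actual, predicted)
print(round(sens, 2), round(over, 2))
```

In this toy case two of the three annotated exons are recovered exactly, (300, 410) misses a boundary, and (800, 900) is the only over-predicted exon.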


• The next step for structural annotation is to take the evolutionary history of genes into account


• Concepts, modelling, computerization, bioanalysis

Structural annotation (deciphering gene structure).

Functional annotation (especially the use of phylogeny to decipher protein function).


Next

• Phylogenomics (genome Evolution)

• Phylopostgenomics

• - phylotranscriptomics

• - phylointeractomics

• ………..


Knowledge/concepts

Observation: conserved syntenic regions exist between species.

Explanation/concept: these regions derive from an ancestral region that evolved independently in each lineage after speciation, but not enough to lose all trace of conservation. From this knowledge and this prediction follows a line of reasoning: the analysis of conserved synteny and the reconstruction of ancestral regions are of interest both from an applied point of view (assistance with positional cloning) and from a conceptual point of view (understanding genome evolution).

Formalization of the biological question

How can conserved synteny be detected?

This is also where conceptualization comes fully into play.

If conserved syntenic regions really derive from an ancestral region, the genes in these regions must show:

1/ orthology relationships

2/ a clustering of orthologous genes that is improbable under the hypothesis of chance (the clustering must be significant).

Programs are therefore needed that can detect orthology relationships and find significant clusters.

Reconstruction of genomes (translocation, fusion, inversion… weighting of these events)
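The significance requirement in point 2 can be illustrated with a textbook hypergeometric tail probability. This is only a baseline sketch, not the test used in CASSIOPE: it treats genes as independent draws and ignores complications such as the size of in-paralog families. The function and parameter names are hypothetical.

```python
from math import comb


def cluster_p_value(total_genes: int, orthologs_total: int,
                    window_size: int, orthologs_in_window: int) -> float:
    """P(X >= orthologs_in_window) when window_size genes are drawn at random
    from a genome of total_genes genes, orthologs_total of which are orthologs
    of genes in the reference region (hypergeometric upper tail)."""
    upper = min(orthologs_total, window_size)
    tail = sum(
        comb(orthologs_total, i) * comb(total_genes - orthologs_total, window_size - i)
        for i in range(orthologs_in_window, upper + 1)
    )
    return tail / comb(total_genes, window_size)


# Toy numbers: 10 genes in the genome, 3 of them orthologs of the reference
# region, and all 3 found together in a window of 3 genes.
print(cluster_p_value(10, 3, 3, 3))  # 1/120
```

A small p-value means the observed clustering would be unlikely if the orthologs were scattered at random along the genome.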


Mathematical modelling

Modelling is needed where computational tools do not exist or where their biological formalism is incorrect. This is the case for the statistical tests of clustering (in particular regarding the size of in-paralog families).

Modelling genome reconstruction

Computational formalization

1) Algorithms: statistical tests; modelling of the ancestral reconstruction of genomes

2) Integration with the other computational tools in the software system (CASSIOPE)


• Bioanalysis
• Automatic search for conserved synteny
• Reconstruction and evolution of genomic regions
• New knowledge and new concepts
• Direct application: aid to positional cloning
• Concepts/knowledge: detection of functional clustering


C.A.S.S.I.O.P.E

• C.A.S.S.I.O.P.E: Clever Agent System for Synteny Inheritance and Other Phenomena in Evolution

• find conserved regions between genomes

For more info see Virginie Lopez Rascol


C.A.S.S.I.O.P.E.


• Toward the ancestral genome reconstruction


C.A.S.S.I.O.P.E

• Bioanalysis
• Automatic search for conserved synteny
• Reconstruction and evolution of genomic regions
• New knowledge and new concepts
• Direct application: aid to positional cloning
• Concepts/knowledge: detection of functional clustering


Collaborators
MEG* project (Modelling of Genome Evolution)

Nathalie Balandraud, Etienne Danchin, Philippe Gouret, Vérane Vitiello

• Math/bio: Julien Berestycki*, Simona Grusea*, Stéphanie Léocard*, Valda Limic*, Laure Rigal*, Etienne Pardoux*

• Info/bio: Olivier Chabrol*, Virginie Lopez*, Cedric Notredame*

• Concepts and bioanalysis: Roxane Barthelemy*, Jean-Paul Casanova*, Elodie Darbo*, Anthony Levasseur*, Eric Faure*, Pierre Pontarotti*

http://www.up.univ-mrs.fr/evol/


Open Discussion

Phylo postgenomic