
Upload: taran

Post on 18-Jan-2016


Page 1: Disease Gene Candidate Prioritization by Integrative Biology Table of contents:

Disease Gene Candidate Prioritization by Integrative Biology

Table of contents:

Background

Networks – deducing functional relationships from PPI data networks

Protein interaction networks

Functional modules / network clusters

Phenotype association

Grouping disorders based on their phenotype.

Biological implications of phenotype clusters.

Method and examples

Integrating protein interaction data and phenotype associations in an automated large-scale disease gene finding platform

Page 2:

Background

Page 3:

Background

Finding genes responsible for major genetic disorders can lead to diagnostics, potential drug targets, treatments and large amounts of information about molecular cell biology in general.

Page 4:

Background

Methods for disease gene finding in the post-genome era (>2001):

Microdeletions

Translocations

http://www.med.cmu.ac.th/dept/pediatrics/06-interest-cases/ic-39/case39.html

http://www.rscbayarea.com/images/reciprocal_translocation.gif

Linkage analysis

Fagerheim et al. 1996.

1q21-1q23.1

chr1:141,600,000-155,900,000

Page 5:

Background

Automated methods for disease gene finding in the post-genome era (>2001):

?

(Perez-Iratxeta, Bork et al. 2002) (Freudenberg and Propping 2002) (van Driel, Cuelenaere et al. 2005) (Hristovski, Peterlin et al. 2005)

Grouping:

Tissues, Gene Ontology, Gene Expression, MeSH terms ...

Page 6:

Disease Gene Finding

Summary

Background

Why do we want to find disease genes, and how has it been done until now?

Networks – deducing functional relationships from network theory

Protein interaction networks

Functional modules / network clusters

Phenotype association

Grouping disorders based on their phenotype.

Biological implications of phenotype clusters.

Method and examples

Combining network theory and phenotype associations in an automated large-scale disease gene finding platform: proof of concept.

Status of pipeline / infrastructure

Page 7:

Networks and functional modules

Deducing functional relationships from protein interaction networks

Page 8:

Networks

Social networks: the CBS interactome (de Lichtenberg et al.)

[Figure legend: daily, weekly, monthly]

Page 10:

Protein interaction networks of physical interactions.

(Barabasi and Oltvai 2004).

Networks

Page 11:

Extracting functional data from protein interaction networks

InWeb

Homo sapiens

The ACh receptor involved in myasthenic syndrome.

Dynamic functional module:

E.g.:

Cell cycle regulation

Metabolism

Page 12:

Trans-organism protein interaction network

Orthologs?

Orthologous genes are direct descendants of a gene in a common ancestor:

(O'Brien K, Remm et al. 2005)

S. cerevisiae

D. melanogaster

H. sapiens

Page 13:

D. melanogaster (experimental)

C. elegans (experimental)

S. cerevisiae (experimental)

H. sapiens MOSAIC
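The trans-organism idea on these two slides can be sketched in a few lines: an interaction observed experimentally in a model organism is transferred to human when both partners have human orthologs. This is only an illustration under stated assumptions. The function name, the dictionary-based ortholog map, and the toy identifiers are hypothetical, and the slides do not describe how the actual pipeline resolves one-to-many orthologs.

```python
from itertools import product


def transfer_interactions(model_interactions, orthologs):
    """Map model-organism interactions onto human gene pairs.

    model_interactions: iterable of (protein_a, protein_b) pairs.
    orthologs: dict mapping a model-organism protein to a set of
               human orthologs (assumed to be precomputed elsewhere).
    Returns a set of frozensets, each one undirected human interaction.
    """
    human = set()
    for a, b in model_interactions:
        # Every combination of human orthologs inherits the interaction.
        for ha, hb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ha != hb:  # skip self-interactions created by shared orthologs
                human.add(frozenset((ha, hb)))
    return human


# Toy data with hypothetical identifiers.
yeast_ppi = [("yA", "yB"), ("yB", "yC")]
yeast_to_human = {"yA": {"hA"}, "yB": {"hB1", "hB2"}}  # yC has no human ortholog
inferred = transfer_interactions(yeast_ppi, yeast_to_human)
```

In this sketch the yB-yC interaction is dropped because yC has no human ortholog, while yA-yB yields two inferred human interactions, one per ortholog of yB.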

Trans-organism protein interaction network

Page 14:

Infrastructure status

BIND

IntAct

DIP

MINT

HPRD

Hand-curated sets

PPI predictions

GRID

InWeb

Homo sapiens

Trans-organism PPI pipeline

> 122,000 interactions

> 22,000 genes

Scoring

A) Topological

B) No. of publications

Extraction

Perl modules

Direct SQL access

XML or SIF output

Web server: Opis

Command line: Inweb.pl

CBS Datawarehouse

Download/reformat DBs

Page 15:

Protein interaction networks: scoring the interactions

Number of methods that have shown the same interaction

Number of independent studies that have shown the same interaction

Number of common interaction partners

Cluster issues

Large scale / small scale issues
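The three counts listed on this slide can be computed directly from a table of interaction records. The sketch below is illustrative only: the record layout (two proteins, a detection method, a publication identifier), the function name, and the toy data are assumptions. It also deliberately stops at the raw counts, since the slides do not say how they are combined into a single score.

```python
from collections import defaultdict


def interaction_evidence(records):
    """Collect per-interaction evidence counts.

    records: iterable of (protein_a, protein_b, method, publication_id).
    Returns {frozenset({a, b}): {"methods", "studies", "shared_partners"}}.
    """
    methods = defaultdict(set)
    studies = defaultdict(set)
    neighbours = defaultdict(set)
    for a, b, method, pub in records:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        pair = frozenset((a, b))
        methods[pair].add(method)
        studies[pair].add(pub)
        neighbours[a].add(b)
        neighbours[b].add(a)

    evidence = {}
    for pair in methods:
        a, b = tuple(pair)
        # Partners that interact with both a and b (a topological feature).
        shared = (neighbours[a] & neighbours[b]) - pair
        evidence[pair] = {
            "methods": len(methods[pair]),
            "studies": len(studies[pair]),
            "shared_partners": len(shared),
        }
    return evidence


# Toy records with hypothetical proteins, methods and publications.
records = [
    ("A", "B", "two-hybrid", "pub1"),
    ("A", "B", "co-IP", "pub2"),
    ("A", "C", "two-hybrid", "pub1"),
    ("B", "C", "two-hybrid", "pub3"),
]
evidence = interaction_evidence(records)
```

Keeping the counts separate makes the large-scale versus small-scale issue visible: an interaction reported once in a single high-throughput screen has one method and one study, however many interactions that screen produced.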

Page 16:

Disease Gene Finding

Summary

Background

Why do we want to find disease genes, and how has it been done until now?

Networks – deducing functional relationships from network theory

Protein interaction networks

Functional modules / network clusters

Phenotype association

Grouping disorders based on their phenotype.

Biological implications of phenotype clusters.

Method and examples

Combining network theory and phenotype associations in an automated large-scale disease gene finding platform: proof of concept.

Status of pipeline / infrastructure

Page 17:

Phenotype association

Page 18:

Phenotype association

Zellweger syndrome

Absent liver peroxisomes; Hepatomegaly; Intrahepatic biliary dysgenesis; Prolonged neonatal jaundice; Pyloric hypertrophy; Patent ductus arteriosus; Ventricular septal defects; Bell-shaped thorax; Small adrenal glands; Absent renal peroxisomes; Clitoromegaly; Cryptorchidism; Hydronephrosis; Hypospadias; Renal cortical microcysts; Failure to thrive; Abnormal electroretinogram; Abnormal helices; Anteverted nares; Brushfield spots; Cataracts; Corneal clouding

Epicanthal folds; Flat facies; Flat occiput; Glaucoma; High arched palate; High forehead; Hypertelorism; Large fontanelles; Macrocephaly; Micrognathia; Nystagmus; Pale optic disk; Pigmentary retinopathy; Posteriorly rotated ears; Protruding tongue; Redundant skin folds of neck; Round facies; Sensorineural deafness; Turribrachycephaly; Upward slanting palpebral fissures; Hyporeflexia or areflexia; Hypotonia

Polymicrogyria; Seizures; Severe mental retardation; Subependymal cysts; Pulmonary hypoplasia; Cubitus valgus; Delayed bone age; Metatarsus adductus; Rocker-bottom feet; Stippled epiphyses (especially patellar and acetabular regions); Talipes equinovarus; Transverse palmar crease; Ulnar deviation of hands; Wide cranial sutures; Heterotopias/abnormal migration; Hypoplastic olfactory lobes

Autosomal recessive; Albuminuria; Aminoaciduria; Decreased dihydroxyacetone phosphate acyltransferase (DHAP-AT) activity; Decreased plasmalogen; Elevated long chain fatty acids; Elevated serum iron and iron binding capacity; Increased phytanic acid; Pipecolic acidemia; Breech presentation; Death usually in first year of life; Genetic heterogeneity; Infants occasionally mistaken as having Down syndrome; Agenesis/hypoplastic corpus callosum

Page 19:

Phenotype association

Word vectors

Phenotype Sim. Score

Adrenoleukodystrophy (202370) 0.781

Hyperpipecolatemia (239400) 0.703

Cerebrohepatorenal Syndrome (214110) 0.682

Refsum Disease (266510) 0.609

Reference: Zellweger Syndrome (214100)

A relationship between the infantile form of Refsum disease and Zellweger syndrome was suggested by the observations of Poulos et al. (1984) in 2 patients. In the infantile form of Refsum disease, as in Zellweger syndrome, peroxisomes are deficient and peroxisomal functions are impaired (Schram et al., 1986). Clinically, infantile Refsum disease, ZWS, and adrenoleukodystrophy have several overlapping features (Stokke et al., 1984). (http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=266510)

214100 202370
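A word-vector comparison of two phenotype descriptions can be sketched as follows. The slide names word vectors and reports a similarity score but does not give the formula, so cosine similarity over raw word counts is an assumption here, as are the function names and the toy descriptions. A real system would probably also normalise terms and down-weight common words.

```python
import math
from collections import Counter


def word_vector(text):
    """Turn a phenotype description into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


# Toy descriptions built from a few clinical terms (hypothetical records).
disease_a = word_vector("absent peroxisomes hepatomegaly hypotonia seizures")
disease_b = word_vector("absent peroxisomes hypotonia deafness")
similarity = cosine_similarity(disease_a, disease_b)
```

The score lies between 0 (no shared words) and 1 (identical word profiles), which matches the range of the similarity scores listed on the slide.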

Page 20:

Phenotype association

Word vectors

Phenotype association network

Cerebrohepatorenal

Zellweger

Refsum

Adrenoleukodystrophy
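Pairwise similarity scores become a phenotype association network once a cut-off decides which pairs count as linked. The sketch below uses the four scores reported against Zellweger syndrome (214100) on the previous slide. The threshold value and the function name are assumptions, since the slides do not state the cut-off actually used.

```python
from collections import defaultdict


def build_phenotype_network(scores, threshold):
    """Link two disorders when their similarity reaches the threshold.

    scores: {(mim_a, mim_b): similarity}
    Returns an undirected adjacency map {mim: {neighbour_mim: similarity}}.
    """
    network = defaultdict(dict)
    for (mim_a, mim_b), score in scores.items():
        if score >= threshold:
            network[mim_a][mim_b] = score
            network[mim_b][mim_a] = score
    return dict(network)


# Similarity scores to Zellweger syndrome (214100) as listed on the slide.
slide_scores = {
    ("214100", "202370"): 0.781,  # Adrenoleukodystrophy
    ("214100", "239400"): 0.703,  # Hyperpipecolatemia
    ("214100", "214110"): 0.682,  # Cerebrohepatorenal syndrome
    ("214100", "266510"): 0.609,  # Refsum disease
}
# 0.7 is an arbitrary illustrative cut-off, not a value from the slides.
network = build_phenotype_network(slide_scores, threshold=0.7)
```

Lowering the threshold to 0.6 would link all four disorders to Zellweger syndrome, so the choice of cut-off directly controls how large each phenotype cluster becomes.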

Page 21:

Disease Gene Finding

Summary

Background

Why do we want to find disease genes, and how has it been done until now?

Networks – deducing functional relationships from network theory

Protein interaction networks

Functional modules / network clusters

Phenotype association

Grouping disorders based on their phenotype.

Biological implications of phenotype clusters.

Method and examples

Combining network theory and phenotype associations in an automated large-scale disease gene finding platform: proof of concept.

Page 22:

Method – proof of concept

Page 23:

Method

InWeb

Homo sapiens

Word vectors

Phenotype clustering
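One plausible way to combine the two data sources on this slide is to score each candidate gene by how strongly its interaction partners are tied to disorders that resemble the query disorder. The sketch below is a guess at the general shape of such an integration, not the scoring used in the presented pipeline. All names, the "best match per partner" rule, and the toy data are assumptions, and the result is a raw score rather than the probability shown in the benchmark.

```python
def score_candidates(candidates, interactions, gene_to_diseases, similarity_to_query):
    """Rank candidate genes by phenotype evidence from their interaction partners.

    candidates: genes inside the linkage interval.
    interactions: {gene: set of interaction partners}.
    gene_to_diseases: {gene: set of disorders already linked to that gene}.
    similarity_to_query: {disorder: phenotype similarity to the query disorder}.
    Returns (gene, score) pairs sorted from highest to lowest score.
    """
    scores = {}
    for gene in candidates:
        total = 0.0
        for partner in interactions.get(gene, ()):
            # Credit each partner with its most similar known disorder.
            total += max(
                (similarity_to_query.get(d, 0.0) for d in gene_to_diseases.get(partner, ())),
                default=0.0,
            )
        scores[gene] = total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example with hypothetical genes (g*), partners (p*) and disorders (D*).
ranking = score_candidates(
    candidates=["g1", "g2"],
    interactions={"g1": {"p1", "p2"}, "g2": {"p3"}},
    gene_to_diseases={"p1": {"D1"}, "p2": {"D2"}, "p3": {"D3"}},
    similarity_to_query={"D1": 0.8, "D2": 0.6},
)
```

Here g1 outranks g2 because both of its partners are linked to disorders that resemble the query phenotype, whereas the only partner of g2 is linked to an unrelated disorder.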

Page 24:

Results - Benchmark

MIM     RANK  GENE             Probability           TRUE
278800  1     ENSG00000032514  0.300326793109544     *
278800  2     ENSG00000188611  0.0125655342047565
278800  2     ENSG00000138297  0.0125655342047565
278800  2     ENSG00000165406  0.0125655342047565
278800  3     ENSG00000196693  0.0121357313793756
278800  3     ENSG00000185532  0.0121357313793756
278800  4     ENSG00000197910  0.00680983722337082
278800  4     ENSG00000165383  0.00680983722337082
278800  4     ENSG00000172538  0.00680983722337082
. . . .
278800  4     ENSG00000165511  0.00680983722337082
278800  4     ENSG00000182354  0.00680983722337082
278800  4     ENSG00000172661  0.00680983722337082
278800  4     ENSG00000165507  0.00680983722337082
278800  4     ENSG00000178440  0.00680983722337082
278800  4     ENSG00000138299  0.00680983722337082
278800  4     ENSG00000197704  0.00680983722337082
278800  4     ENSG00000012779  0.00680983722337082
278800  4     ENSG00000197354  0.00680983722337082
278800  4     ENSG00000189090  0.00680983722337082
278800  4     ENSG00000107551  0.00680983722337082
278800  4     ENSG00000126542  0.00680983722337082
278800  4     ENSG00000198364  0.00680983722337082
278800  4     ENSG00000185849  0.00680983722337082
278800  4     ENSG00000150165  0.00680983722337082
278800  4     ENSG00000128815  0.00680983722337082
278800  4     ENSG00000178645  0.00680983722337082
278800  4     ENSG00000138293  0.00680983722337082
278800  4     ENSG00000176833  0.00680983722337082
278800  4     ENSG00000179251  0.00680983722337082
278800  4     ENSG00000169826  0.00680983722337082
278800  4     ENSG00000172678  0.00680983722337082
278800  4     ENSG00000197752  0.00680983722337082
278800  5     ENSG00000107643  0.00412573091718715
278800  6     ENSG00000165733  0.000263885640603109
278800  7     ENSG00000169813  6,63E+07

DE SANCTIS-CACCHIONE SYNDROME

Gene map locus 10q11: >12 Mb area, 103 ranked genes

CLINICAL FEATURES

De Sanctis and Cacchione (1932) reported a condition, which they called 'xerodermic idiocy,' in which patients had xeroderma pigmentosum, mental deficiency, progressive neurologic deterioration, dwarfism, and gonadal hypoplasia.

http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=278800
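In the benchmark table for this disorder, genes with equal probability share a rank and the next distinct probability takes the next rank (1, 2, 2, 2, 3, 3, ...). That is a dense ranking, which can be sketched as below. The function name is an assumption and the input is the first six rows of the benchmark table. How the pipeline itself assigns ranks is not described in the slides.

```python
def dense_rank(scored):
    """Assign dense ranks: equal probabilities share a rank, with no gaps.

    scored: list of (gene, probability) pairs.
    Returns a list of (rank, gene, probability), best candidate first.
    """
    ordered = sorted(scored, key=lambda pair: pair[1], reverse=True)
    ranked, rank, previous = [], 0, None
    for gene, prob in ordered:
        if prob != previous:  # a new distinct value starts the next rank
            rank += 1
            previous = prob
        ranked.append((rank, gene, prob))
    return ranked


# First six rows of the benchmark table for MIM 278800.
rows = [
    ("ENSG00000032514", 0.300326793109544),
    ("ENSG00000188611", 0.0125655342047565),
    ("ENSG00000138297", 0.0125655342047565),
    ("ENSG00000165406", 0.0125655342047565),
    ("ENSG00000196693", 0.0121357313793756),
    ("ENSG00000185532", 0.0121357313793756),
]
ranked = dense_rank(rows)
```

With dense ranks, a rank of 4 in the table does not mean only three genes score higher, so the number of tied genes matters when judging how far a list narrows down the candidates.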

Page 25:

Results – Benchmarking

DE SANCTIS-CACCHIONE SYNDROME

Ranked 1

Probability: 0.300326793109544

DNA excision repair protein ERCC-6

Eukaryotic translation initiation factor 4E (eIF4E)

DNA excision repair protein ERCC-2

Eukaryotic initiation factor 4A-I (eIF4A-I)

*126340 DNA REPAIR DEFECT EM9 OF CHINESE HAMSTER OVARY CELLS, COMPLEMENTATION OF; EM9

#133540 COCKAYNE SYNDROME CKN2

#278730 XERODERMA PIGMENTOSUM, COMPLEMENTATION GROUP D

#278800 DE SANCTIS-CACCHIONE SYNDROME

#601675 TRICHOTHIODYSTROPHY

Page 26:

Results – Benchmarking

DE SANCTIS-CACCHIONE SYNDROME

Ranked 2

Probability: 0.0125655342047565

Page 27:

Disease Gene Finding

Summary

Background

Why do we want to find disease genes, and how has it been done until now?

Networks – deducing functional relationships from network theory

Protein interaction networks

Functional modules / network clusters

Phenotype association

Grouping disorders based on their phenotype.

Biological implications of phenotype clusters.

Method and examples

Combining network theory and phenotype associations in an automated large-scale disease gene finding platform: proof of concept.

Page 28:

Acknowledgments

Disease Gene Finding:

Olga Rigina
Olof Karlberg

Zenia M. Størling
Páll Ísólfur Ólason

Kasper Lage
Anders Gorm
Anders Hinsby
Yves Moreau

Niels Tommerup
Søren Brunak