prediction of protein-protein interactions in deinococcus radiodurans r1 strain using computational...


Upload: anis-owen

Post on 17-Jan-2016


Page 1: Prediction of Protein-Protein Interactions in Deinococcus radiodurans R1 strain using computational methods Selvakumar E.* Bioinformatics Centre, Pondicherry

Prediction of Protein-Protein Interactions in Deinococcus radiodurans R1 strain using computational methods

Selvakumar E.*

Bioinformatics Centre, Pondicherry University, Pondicherry - 605 014.

* Presenting Author

Protein-Protein Interaction for Functional Linkage Analysis

A complete understanding of protein function requires identifying a protein's interacting partners and, if the protein is a component of a molecular complex or pathway, its interacting subunits. Knowledge of these relationships, which we call "Functional Linkage", is a prerequisite for understanding the biology of an organism.

In silico approaches

Gene Neighbor method (Tamames et al., 1997)

Phylogenetic Profile (Pellegrini et al., 1999)

Rosetta Stone method (Eisenberg et al., 2000)

Study Organism: Deinococcus radiodurans

D. radiodurans, literally meaning "strange berry that withstands radiation," has since been classified as a member of the family Deinococcaceae. It is a Gram-positive, red-pigmented, non-motile bacterium (Anderson et al., 1956).

All species in the genus Deinococcus, in particular D. radiodurans, are extremely resistant to a number of agents and conditions that damage DNA, including ionizing and ultraviolet (UV) radiation and hydrogen peroxide (Minton, 1994).

The radiation resistance of D. radiodurans makes it an ideal candidate for bioremediation of sites contaminated with radiation and toxic chemicals.

Figure: D. radiodurans viewed by electron microscopy (courtesy M. Daly).

D. radiodurans Genome

The D. radiodurans genome sequence was determined by the random whole-genome shotgun method (White et al., 1999).

General features of the D. radiodurans genome:

Molecule        Length (bp)   Average ORF length (bp)   Protein-coding regions   GC content   Repeat content
Chromosome I    2,648,638     913                       90.8%                    67.0%        1.8%
Chromosome II   412,348       1,044                     93.5%                    66.7%        1.4%
Megaplasmid     177,466       1,100                     90.4%                    63.2%        9.2%
Plasmid         45,704        928                       80.9%                    56.1%        13.0%
All             3,284,156     937                       90.9%                    66.6%        3.8%

Gene Neighbor Method

If the genes that encode two proteins are neighbors on the chromosome in several genomes, the proteins tend to be functionally linked.

The Gene Neighbor (GN) method identifies protein pairs encoded in close proximity across multiple genomes.

Figure: here genes A and B are neighbors, while A and C are not.


CLOSE
Any pair of genes occurring within a single run is called "close."

RUN
A set of genes occurring on a prokaryotic chromosome is called a "run" if and only if they all occur on the same strand and the gaps between adjacent genes are 300 base pairs or less.

BIDIRECTIONAL BEST HIT (BBH)
Given two genes Xa and Xb from two genomes Ga and Gb, Xa and Xb are called a "bidirectional best hit (BBH)" if and only if recognizable similarity exists between them (in our case, we used blastp with an E-value of 0.001), there is no gene Zb in Gb that is more similar than Xb is to Xa, and there is no gene Za in Ga that is more similar than Xa is to Xb.

PAIR OF CLOSE BIDIRECTIONAL BEST HITS (PCBBH)
Genes (Xa, Ya) from Ga and (Xb, Yb) from Gb form a "pair of close bidirectional best hits (PCBBH)" if and only if Xa and Ya are close, Xb and Yb are close, Xa and Xb are a BBH, and Ya and Yb are a BBH.
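The BBH and PCBBH definitions can be read as two small set operations. The following is a minimal Python sketch of them, not the Java programs used in this study. It assumes the similarity scores have already been filtered by the blastp E-value cutoff and that the close pairs of each genome are already known; the gene names, scores, and function names are hypothetical and chosen only for illustration.

```python
def best_hits(scores):
    """Map each query gene to its best-scoring subject gene.

    `scores` maps (query, subject) -> similarity score (higher is better).
    Scores are assumed to be pre-filtered by the E-value cutoff.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return the set of (Xa, Xb) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(xa, xb) for xa, xb in best_ab.items() if best_ba.get(xb) == xa}


def pcbbh(close_a, close_b, bbh):
    """Return ((Xa, Ya), (Xb, Yb)) tuples that satisfy the PCBBH definition.

    close_a and close_b are sets of frozenset({gene1, gene2}) close pairs.
    """
    partner = dict(bbh)  # Xa -> Xb
    found = []
    for pair in close_a:
        xa, ya = sorted(pair)
        if xa in partner and ya in partner:
            xb, yb = partner[xa], partner[ya]
            if frozenset((xb, yb)) in close_b:
                found.append(((xa, ya), (xb, yb)))
    return found


# Hypothetical toy data: genome A genes a1..a3, genome B genes b1..b2.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 180, ("a3", "b1"): 90}
b_vs_a = {("b1", "a1"): 200, ("b2", "a2"): 180, ("b2", "a1"): 50}
close_a = {frozenset(("a1", "a2"))}
close_b = {frozenset(("b1", "b2"))}

bbh = bidirectional_best_hits(a_vs_b, b_vs_a)
pairs = pcbbh(close_a, close_b, bbh)
```

In this toy data a3 has b1 as its best hit, but b1 prefers a1, so (a3, b1) is not a BBH. That asymmetry is exactly what the reciprocal check is meant to filter out.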

Objectives

To identify the closes and runs in the D. radiodurans genome

To locate the bidirectional best hits (BBH) among the genomes taken

To locate the pairs of close bidirectional best hits (PCBBH) among the genomes taken

To find conserved neighbor gene pairs

Tools & Dataset

Model organisms taken from NCBI in .ptt format:

Organism name         RefSeq No.   Chromosome   No. of bp   No. of proteins
D. radiodurans R1     NC_001263    I            2648638     2629
D. radiodurans R1     NC_001264    II           412348      368
E. coli K12           NC_000913    I            4639675     4242
Halobacterium NRC-1   NC_002607    I            2014239     2075
H. pylori J99         NC_000912    I            1643831     1491
T. thermophilus HB8   NC_006461    I            1849742     1973

S.No   Program                 Purpose
1      Finding_Pairs2.java     To extract the "neighbor genes" from .ptt files
2      Blast.java              To extract the best hit from the BLAST results
3      New_Compare_2.java      To extract the protein ID numbers from the above output
4      Compare_Database.java   To extract the BBH from the above output

Methodology

To identify the neighbor genes (closes): starting from the genomic protein table (.ptt) files of all five organisms, we used an in-house Java program that finds all neighbor genes within a threshold distance of 300 bp, 275 bp, and 250 bp.

To identify the gene clusters (runs): we joined the gene neighbors that occur consecutively.

To identify the bidirectional best hits (BBH): we searched for the same "close" proteins in the other four genomes by manual comparison of the protein names.

All these "close" sequences were retrieved from the NCBI GenPept database in FASTA format, using the Entrez retrieval tool, for all five organisms.

To screen the BBHs from these closes, we compared the close sequences of D. radiodurans against the close sequences of the other four organisms' genomic proteins as the database, and also reciprocally.
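The neighbor-finding and run-joining steps can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the in-house Java program: it assumes each gene has already been reduced to an identifier, start, end, and strand (as could be parsed from a protein table), it breaks a run whenever the next gene in coordinate order is on the other strand, and all gene names and coordinates are hypothetical.

```python
from itertools import combinations
from typing import NamedTuple


class Gene(NamedTuple):
    pid: str
    start: int
    end: int
    strand: str  # "+" or "-"


def find_runs(genes, max_gap=300):
    """Group genes into runs: same strand, adjacent gaps <= max_gap bp."""
    runs, current = [], []
    for gene in sorted(genes, key=lambda g: g.start):
        if current:
            prev = current[-1]
            gap = gene.start - prev.end - 1  # bases between the two genes
            if gene.strand == prev.strand and gap <= max_gap:
                current.append(gene)
                continue
            if len(current) > 1:
                runs.append(current)
        current = [gene]
    if len(current) > 1:
        runs.append(current)
    return runs


def close_pairs(runs):
    """Every pair of genes inside one run is 'close'."""
    pairs = set()
    for run in runs:
        for g1, g2 in combinations(run, 2):
            pairs.add(frozenset((g1.pid, g2.pid)))
    return pairs


# Hypothetical coordinates, for illustration only.
genes = [
    Gene("g1", 100, 400, "+"),
    Gene("g2", 450, 900, "+"),     # 49 bp after g1
    Gene("g3", 1000, 1500, "-"),   # strand change breaks the run
    Gene("g4", 1700, 2000, "-"),   # 199 bp after g3
    Gene("g5", 2290, 2600, "-"),   # 289 bp after g4
]

closes_by_threshold = {
    gap: len(close_pairs(find_runs(genes, max_gap=gap))) for gap in (300, 275, 250)
}
```

With these toy coordinates the 300 bp threshold keeps g5 in the second run, while 275 bp and 250 bp drop it, so the number of close pairs depends on the threshold only when some gap falls between the cutoffs.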

Results

Closes
With the threshold distance set to 300 bp, 275 bp and 250 bp:
chromosome I: 1314 closes
chromosome II: 183 closes

Runs
All the genes were found to be present as a continuous stretch, i.e. a single run.

BBH
Organisms                            Number of BBH
D. radiodurans and E. coli           9
D. radiodurans and H. salinarum      4
D. radiodurans and H. pylori         16
D. radiodurans and T. thermophilus   24

PCBBH
By comparing the BBH closes, we were able to identify only one PCBBH, between the genomes of D. radiodurans and T. thermophilus.

Results

Organism          Neighbor PID   Encoded protein
D. radiodurans    15807039       Ribosomal protein L1
                  15807041       Transcription antitermination protein NusG
T. thermophilus   55980215       Ribosomal protein L1
                  55980217       Transcription antitermination protein NusG

Summary

All the genes are found as neighbors (closes), which means the "gene density" is very high.

Very few genes are conserved together with their neighbors.

Phylogenetic studies of highly conserved genes have suggested that the Deinococci are most closely related to the Thermus genus and that these two lineages form a eubacterial phylum (Hensel et al., 1986). The fact that some Deinococci are slightly thermophilic suggests that the common ancestor of the Deinococcus-Thermus group was thermophilic.

Phylogenetic Profiling

The aim is to assign functions to genes based on their patterns of inheritance across species.

The hypothesis behind this method is that proteins that participate in a common structural complex or metabolic pathway evolve in a correlated fashion, and should therefore be present in the same organisms.

This method is based on the pattern of presence or absence of a given gene in a set of organisms (Gaasterland and Ragan, 1998; Pellegrini et al., 1999).

Phylogenetic Profiling

The phylogenetic profile of a gene is a string of bits, each bit indicating the presence or absence of a homologue of the gene in a different organism.

Once the phylogenetic profiles for all the genes of a given organism have been computed, the function of a gene can be inferred to some extent by examining the functions of other genes with similar or at least related profiles (Vert, 2002; Wu et al., 2003).

The main limitation of the approach is that it can only be applied to complete genomes.

Vert (2002) proposed mapping each profile to a high-dimensional vector space (feature space), defined in such a way that each coordinate in the feature space corresponds to one particular pattern of inheritance during evolution.
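The bit-string idea can be illustrated with a short Python sketch. It assumes the homolog search has already produced, for each protein, the set of reference genomes in which a homologue was found. The genome and protein names are hypothetical, and Hamming distance is used here only as one simple way to compare profiles. It is not the tree kernel of Vert (2002), nor the scoring used in this study.

```python
def build_profile(genomes_with_homolog, reference_genomes):
    """Return a bit string: '1' where a homologue is present, '0' where absent."""
    return "".join(
        "1" if genome in genomes_with_homolog else "0" for genome in reference_genomes
    )


def hamming_distance(profile_a, profile_b):
    """Count the genomes in which the two profiles disagree."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def most_similar(query, profiles):
    """Return the protein whose profile is closest to the query's profile."""
    others = [name for name in profiles if name != query]
    return min(others, key=lambda name: hamming_distance(profiles[query], profiles[name]))


# Hypothetical reference genomes and homolog search results.
reference_genomes = ["genome_A", "genome_B", "genome_C", "genome_D"]
homologs = {
    "P1": {"genome_A", "genome_C"},
    "P2": {"genome_A", "genome_C"},
    "P3": {"genome_B"},
    "P4": {"genome_A", "genome_C", "genome_D"},
}

profiles = {
    name: build_profile(found, reference_genomes) for name, found in homologs.items()
}
closest_to_p1 = most_similar("P1", profiles)
```

If P1 were an uncharacterized protein and P2 had a known function, their identical profiles would be the kind of evidence the method uses to suggest a functional link between them.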

Objective

The main aim of this study is to use the phylogenetic profile method to construct phylogenetic profiles, and thereby to identify homologous proteins across genomes and assign functions to some of the uncharacterized proteins present in the genome of interest.

Predicted protein-coding region sequences of the D. radiodurans genome (White et al., 1999)

The three middle columns (putative function assigned, function unknown, conserved hypothetical) are the subcategories of the sequences identified by database match.

Molecule        Identified by database match   Putative function assigned   Function unknown   Conserved hypothetical   No database match   Total
Chromosome I    1812                           1211                         145                456                      821                 2633
Chromosome II   255                            186                          22                 47                       114                 369
Megaplasmid     94                             80                           9                  5                        51                  145
Plasmid         24                             16                           5                  3                        16                  40
All             2185                           1493                         181                511                      1002                3187


ftp://ftp.ncbi.nih.gov/blast/executables/LATEST/

170 fully sequenced genomes

Results

[Figure; source credited as Altschul et al.]


Phylogenetic Profile Construction

[Figure labels: query protein GI number, database, total no. of proteins, presence/absence]

Data Analysis

S.No   Program              Purpose
1      Blast_Process.java   Program to extract required fields from the BLAST output
2      New_Compare.java     Program to extract homologous proteins from the BLAST output
3      Prob_Hypo.java       Program to construct the matrix
4      Hypo_Prob.java       Program to extract hypothetical proteins
5      Prob_Min.java        Program for the hypergeometric distribution

Hypergeometric Distribution

We next determined the probability that two proteins have coevolved, based on the similarity of their profiles.

If we assume that two proteins A and B do not coevolve, we can compute the probability of observing a specific overlap between their profiles by chance, using the hypergeometric distribution.

N = the total number of genomes analyzed
n = the number of homologs for protein A
m = the number of homologs for protein B
k′ = the number of genomes that contain homologs of both A and B

Because P represents the probability that the proteins do not coevolve, 1 - P(k > k′) is then the probability that they do coevolve. We compute this probability for all pairs of proteins within the genome.

(Bowers et al., 2004; Wu et al., 2003)
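The formula itself did not survive in this transcript. As a hedged illustration, the standard hypergeometric probability of observing exactly k shared genomes, using the symbols defined above, is P(k) = C(n, k) · C(N − n, m − k) / C(N, m), where C denotes the binomial coefficient. The Python sketch below computes this value and the chance probability of an overlap at least as large as the observed one. It is not the Prob_Min.java program, the function names are hypothetical, and how the slide's 1 - P(k > k′) score was derived from these quantities cannot be recovered from the transcript.

```python
from math import comb


def overlap_probability(N, n, m, k):
    """Chance probability that exactly k genomes hold homologs of both A and B.

    N: genomes analyzed; n: genomes with a homolog of A;
    m: genomes with a homolog of B. Assumes A and B are independent.
    """
    if k < max(0, n + m - N) or k > min(n, m):
        return 0.0  # overlap of this size is impossible
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)


def tail_probability(N, n, m, k_observed):
    """Chance probability of an overlap of at least k_observed genomes."""
    return sum(
        overlap_probability(N, n, m, k) for k in range(k_observed, min(n, m) + 1)
    )


# Hypothetical example: 10 genomes, A found in 4, B found in 3, both in 2.
p_exact = overlap_probability(N=10, n=4, m=3, k=2)
p_at_least = tail_probability(N=10, n=4, m=3, k_observed=2)
```

A small tail probability means the observed overlap is unlikely to arise by chance, which is the basis for treating a pair of profiles as evidence of coevolution.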


Hypergeometric Distribution output

Results

Probability that two proteins have coevolved, 1 - P(k > k′), for pairs with a high confidence value

Results

Based on the confidence values obtained from the hypergeometric distribution, 297 of the 1428 hypothetical proteins in D. radiodurans R1 were predicted to have protein-protein interactions.

References

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-410.

Anderson, Nordan H, Cain R, Parrish G, Duggan D (1956). Studies on a radio-resistant micrococcus. Isolation, morphology, cultural characteristics, and resistance to gamma-radiation. Food Technol. 10:575-577.

Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D (2004). Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol. 5(5):R35.

Eisenberg D, Marcotte EM, Xenarios I, Yeates TO (2000). Protein function in the post-genomic era. Nature 405:823-826.

Gaasterland T, Ragan MA (1998). Microbial genescapes: phyletic and functional patterns of ORF distribution among prokaryotes. Microb. Comp. Genomics 3(4):199-217.

Hensel R, Demharter W, Kandler O, Kroppenstedt RM, Stackebrandt E (1986). Chemotaxonomic and molecular-genetic studies of the genus Thermus: evidence for a phylogenetic relationship of Thermus aquaticus and Thermus ruber to the genus Deinococcus. Int. J. Syst. Bacteriol. 36:444.

Minton KW (1994). DNA repair in the extremely radioresistant bacterium Deinococcus radiodurans. Mol. Microbiol. 13(1):9-15.

Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO (1999). Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl. Acad. Sci. 96:4285-4288.

Tamames J, Casari G, Ouzounis C, Valencia A (1997). Conserved clusters of functionally related genes in two bacterial genomes. J. Mol. Evol. 44:66-73.

Vert J (2002). A tree kernel to analyse phylogenetic profiles. Bioinformatics 18:S276-S284.

White O, Eisen JA, Heidelberg JF, Hickey EK, Peterson JD, Dodson RJ, Haft DH, Gwinn ML, Nelson WC, Richardson DL, Moffat KS, Qin H, Jiang L, Pamphile W, Crosby M, Shen M, Vamathevan JJ, Lam P, McDonald L, Utterback T, Zalewski C, Makarova KS, Aravind L, Daly MJ, Minton KW, Fleischmann RD, Ketchum KA, Nelson KE, Salzberg S, Smith HO, Venter JC (1999). Genome sequence of the radioresistant bacterium Deinococcus radiodurans R1. Science 286, 19.

Wu J, Kasif S, DeLisi C (2003). Identification of functional links between genes using phylogenetic profiles. Bioinformatics 19:1524-1530.