

Detection of novel recombinases in bacteriophage genomes unveils Rad52, Rad51 and Gp2.5 remote homologs

Anne Lopes1,2, Jihane Amarir-Bouhram3, Guilhem Faure1,2, Marie-Agnès Petit3,* and Raphaël Guerois1,2,*

1CEA, iBiTecS, F-91191 Gif sur Yvette, 2CNRS, URA 2096, Gif-sur-Yvette, F-91191 and 3INRA, UMR 1319, Micalis, 78350 Jouy en Josas, France

Received July 27, 2009; Revised February 2, 2010; Accepted February 3, 2010

ABSTRACT

Homologous recombination is a key contributor to bacteriophage genome repair, circularization and replication. No fewer than six kinds of recombinase genes have been reported so far in bacteriophage genomes, two (UvsX and Gp2.5) from virulent, and four (Sak, Redβ, Erf and Sak4) from temperate phages. Using profile–profile comparisons, structure-based modelling and gene-context analyses, we provide new views on the global landscape of recombinases in 465 bacteriophages. We show that Sak, Redβ and Erf belong to a common large superfamily adopting a shortcut Rad52-like fold. Remote homologs of Sak4 are predicted to adopt a shortcut Rad51/RecA fold and are found to be widespread among phage genomes. Unexpectedly, within temperate phages, gene-context analyses also pinpointed the presence of distant Gp2.5 homologs, previously believed to be restricted to virulent phages. All in all, three major superfamilies of phage recombinases emerged, related to Rad52-like, Rad51-like or Gp2.5-like proteins. For two newly detected recombinases belonging to the Sak4 and Gp2.5 families, we provide experimental evidence of their recombination activity in vivo. Temperate versus virulent lifestyle, together with the importance of genome mosaicism, is discussed in the light of these novel recombinases. Screening for these recombinases in genomes can be performed at http://biodev.extra.cea.fr/virfam.

INTRODUCTION

Interest in bacteriophages, viruses infecting bacteria, and in viruses in general, is ever-growing with the discovery of their ubiquity in various natural ecosystems (1–4), their diversity (5–7), and their probable regulatory role in the equilibrium of bacterial populations (8–11). However, phage diversity also represents a challenge when their genomes need to be annotated. At present, each new genome brings in a majority of unknown predicted proteins (5–7). Large-scale analyses with ACLAME (12,13), a database dedicated to the comparative analysis of microbial mobile elements, have confirmed the abundance of orphan proteins in all completely sequenced phage genomes (14). This diversity may appear puzzling when one considers the simplicity of the bacteriophage life cycle, which consists mainly of reproducing its genome and its capsid in a given bacterial host. Phages ought to encode a few basic functions to this end, and one may expect to identify them easily by protein alignments. The fact that the number of unknown phage genes continues to grow suggests either that many phage functions do not derive from a common ancestor or that the current tools for protein alignments are not appropriate. It has been known for a long time that both mutation (15) and recombination rates (16) are much higher in phages than in bacteria, and this may blur the homology signal (17).

For this reason, we decided to apply sensitivity-enhanced techniques to investigate in detail one of these families of phage-encoded proteins. We focused on the phage proteins that are responsible for homologous recombination, which is an essential function in numerous phages: it contributes to genome repair (18), to the circularization of genomes with terminal repetitions (19), and to replication (20). When absent, the function may be complemented by the host-encoded RecA protein. However, the conservation of this phage-encoded function suggests that phages are better served by their own recombinases, as laboratory experiments may not reflect the variety of growth conditions that phages and their hosts face in nature. Virulent and temperate phages are thought to contain different recombinases. So far, six families of recombinase genes have been reported in bacteriophage genomes, two (UvsX and Gp2.5) from virulent, and four (Sak, Redβ, Erf and Sak4) from temperate phages, or from ex-temperate phages that have become virulent after the loss of a lysogeny module (designated collectively hereafter as temperate). Virulent and temperate phages adopt radically different relationships with their hosts. Virulent phages can only kill them, whereas temperate phages can integrate within their host genome. Different temperate genomes can therefore meet in the same host and create mosaics by recombination, an event less likely to occur in virulent phages. Recombination processes therefore play a key role in these various lifestyles, and the functions of these recombinases need to be further uncovered.

*To whom correspondence should be addressed. Tel: +33 1 34 65 28 67; Fax: +33 1 34 65 20 65; Email: [email protected]

The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

Correspondence may also be addressed to Raphaël Guerois. Tel: +33 1 69 08 67 17; Fax: +33 1 69 08 92 75; Email: [email protected]

Nucleic Acids Research, 2010, Vol. 38, No. 12, 3952–3962. Published online 1 March 2010. doi:10.1093/nar/gkq096

© The Author(s) 2010. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded from https://academic.oup.com/nar/article/38/12/3952/2409217 by guest on 04 January 2022

Among virulent phages, UvsX, a RecA-like enzyme present in phage T4, has orthologs in other phages with large genomes and strictly virulent lifestyles (25 orthologs in the ACLAME database version 0.3, March 2008, containing 465 phage genomes). The virulent phage T7 encodes the Gp2.5 recombinase, which shares functional and structural homology with SSB proteins (21), but in contrast to SSB it also provides a recombinase function through its single-strand annealing activity (22,23). It has only seven orthologs in the ACLAME database.

The recombinases identified in temperate phages seem more diverse, although they share common properties. Sak, Redβ and Erf, which were delineated as three distinct families on the basis of their primary sequences (24), harbour a single-strand annealing activity. All three proteins form similar ring-like quaternary structures visible by electron microscopy (25–27). Redβ from phage λ and RecT from prophage rac are two well-known members of the Redβ-like family that are used for genetic engineering (28,29). Three striking properties distinguish these recombinases from RecA-like enzymes: (i) their ATP-independence, (ii) their high efficiency in pairing short, 40–50 bp long sequences (30) and (iii) their ability to incorporate short single-strand DNA substrates into the bacterial chromosome (31). In addition, the Redβ protein was recently reported to be 100-fold more efficient than RecA for homeologous recombination between 22% divergent sequences (16). All these characteristics may account for the remarkable plasticity and mosaicism of temperate bacteriophage genomes. Interestingly, the Sak protein shares homology with the N-terminal, globular domain of the eukaryotic Rad52 protein (24,26), and a distant homology between Redβ and Rad52 has also been reported (32).

It is suspected that other families of recombinases are yet to be discovered. Indeed, most sequenced phage genomes do not encode homologs of any of the aforementioned proteins (78% of phage genomes in ACLAME). A genetic study by the group of Moineau has probably uncovered one such new family called Sak4 (33). The common characteristic of all proteins called ‘Sak’ is their sensitivity to the plasmid-encoded phage-resistance system AbiK (for abortive infection system K). This plasmid is present in some Lactococcus lactis bacterial strains, and confers phage resistance. Sak and Sak2 belong to the same family, and Sak3 was identified as a homolog of Erf (33), but Sak4 could not be related to any known recombinase. Sak4 has not yet been formally proven to be a recombinase, but the fact that sak4 bacteriophage mutants are as resistant to AbiK as are erf or sak recombinase mutants argues strongly for such an activity.

In summary, with the current protein alignment tools, at least six families of phage recombinases have been found or suspected, namely UvsX, Gp2.5, Sak, Redβ, Erf and Sak4. The first two systems, UvsX and Gp2.5, seem quite specific to the virulent bacteriophage lifestyle, whereas the four others appear to be more broadly distributed. Using in-depth bioinformatic methodologies, coupling profile–profile comparisons, structure-based modelling and gene-context analyses, we provide access to the global landscape of recombinases in bacteriophages: based on the global analysis of 465 bacteriophage genomes archived in the ACLAME database (12,13), we show that the Sak, Redβ and Erf families belong to the same Rad52 superfamily. We unveil that Sak4 homologs are widespread and correspond to a shortcut version of Rad51. Coupled with gene-context analyses, these large-scale profile–profile analyses further reveal that Gp2.5-like recombinases are also widespread in bacteriophage genomes and are not limited to relatives of T7. For two newly detected recombinases belonging to the Sak4 and Gp2.5 families, we provide experimental evidence of their recombination activity in vivo. We therefore conclude that, contrary to what was previously suspected, a majority of bacteriophages encode a recombinase function that can be classified into one of only three superfamilies: Rad52, Rad51 or Gp2.5. The discovery of so many recombination systems, which appear to segregate mainly according to the temperate or virulent lifestyle of the phage, raises fascinating questions as to their DNA repair strategies, their genome diversity and their evolution.

MATERIALS AND METHODS

Bioinformatics procedures

All the 465 genomes used for this study were taken from the ACLAME database version 0.3, release of March 2008 (A CLAssification of Mobile genetic Elements) (12,13). For each protein of the 465 phages, multiple sequence alignments were built using three iterations of PSI-Blast (34) against the nr90 database (non-redundant sequence database filtered at 90% identity) and were transformed into HMM profiles using the HHmake algorithm (35), resulting in a total of 28 300 HMM profiles.

The HMM profiles of four recombinases [Sak from ul36 (NP_663647), Redβ from 933W (NP_049474), Erf from D3 (NP_061548) and Sak4 from ϕ31 (AAC48871)] were used as initial queries to screen the 28 300 profiles of the 465 genomes to identify homologs using the profile–profile comparison program HHsearch (v1.6.0), also called HHpred (35). An HHsearch hit was considered significant when the probability of being a true positive exceeded 90% with both global and local alignment modes (see Supplementary Figure S1 for justification of the threshold). The same threshold of 90% was used in the next iterations, in which previously detected profiles were in turn compared to the whole-profile database until convergence was reached. Sak4 identification required further validation because the protein is composed of an ATPase domain, which belongs to a widely represented superfamily. Using the Sak4 profile as a query matched not only Sak4 homologs but also other ATPases such as RecA proteins or DnaB helicases. Sak4 proteins are shorter than RecA or DnaB, which contain a DNA-binding or a DnaB-like domain in their C-terminal region, respectively. To prevent the selection of any false positive in the assignment of Sak4 homologs, only short sequences (less than 290 residues), excluding the possibility of RecA or DnaB C-terminal domains, were assigned as Sak4.

Models of Sak, Redβ and Erf undecamers (11-mers) were generated with Modeller 9v5 (36) using the human Rad52 N-terminal domain as template (PDB:1h2i). Initial alignments between one monomer of each recombinase and the template were obtained from profile–profile comparison (35). The sequence divergence between the three recombinases and the Rad52 template was such that several regions had to be carefully re-aligned in order to provide sequence alignments consistent with the structural constraints imposed by the Rad52 fold and to optimize the scores of the Verify3D (37) and Prosa (38) methods, which assess the likelihood of the 3D models (see scores in Supplementary Table S1 and Figure S2). The same procedure was followed for generating Sak4 and Gp2.5 models using RadB from Thermococcus kodakarensis (PDB:2cvh) and Gp2.5 from phage T7 (PDB:1je5) as templates, respectively. The multiple sequence alignments were processed with Jalview (39). Structural models were represented using PyMol (40). Conservation analyses were carried out using the Rate4Site algorithm (41). Profile–profile comparisons against the recombinase profiles database can be run and all profiles, models and alignments can be downloaded at http://biodev.extra.cea.fr/virfam/downloads.html.

Gene-context analyses were run considering the 20 genes upstream and downstream of each of the 133 recombinases identified above. Profile–profile comparisons between the profiles of these genes and the 28 300-profile database were performed as described above. Related genes were clustered together using full linkage hierarchical agglomerative clustering. This procedure was repeated for all the phage genomes in which no recombinase was detected, focusing on the DNA replication modules.
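The selection rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Hit` record layout and the function names are hypothetical, no HHsearch call is made, and the gene window assumes a linear gene order that is simply clipped at the genome ends. The three constants (90% probability, 290 residues, 20 flanking genes) are the values given in the text.

```python
from dataclasses import dataclass
from typing import List

PROB_THRESHOLD = 90.0   # percent, threshold stated in the text
SAK4_MAX_LENGTH = 290   # residues, length cut-off stated in the text
FLANK = 20              # genes examined on each side, stated in the text


@dataclass(frozen=True)
class Hit:
    """One profile-profile hit (hypothetical record layout)."""
    target: str
    prob_global: float   # true-positive probability in global mode (%)
    prob_local: float    # true-positive probability in local mode (%)
    target_length: int   # length of the target protein in residues


def is_significant(hit: Hit) -> bool:
    """A hit counts only if BOTH alignment modes exceed the threshold."""
    return hit.prob_global > PROB_THRESHOLD and hit.prob_local > PROB_THRESHOLD


def is_sak4_candidate(hit: Hit) -> bool:
    """Sak4 assignment: significant hit AND a sequence too short to carry
    the extra C-terminal domains of RecA or DnaB."""
    return is_significant(hit) and hit.target_length < SAK4_MAX_LENGTH


def gene_window(index: int, n_genes: int, flank: int = FLANK) -> List[int]:
    """Indices of up to `flank` genes on each side of the gene at `index`.
    Assumption: linear gene order, clipped at the genome ends."""
    start = max(0, index - flank)
    stop = min(n_genes, index + flank + 1)
    return [i for i in range(start, stop) if i != index]
```

For example, a 245-residue target scoring above 90% in both modes passes `is_sak4_candidate`, whereas an equally well-scoring target of RecA-like length is rejected by the length rule alone.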

Single-strand annealing assay

In the low-copy number and thermosensitive plasmid pKD46, redβ of phage λ is cloned under a ParaB promoter, and its transcription is induced by AraC in the presence of arabinose (28). The pKD46 backbone was used to replace the λ fragment encompassing gam, redβ and redα by uvsX of phage T4, gp2.5 of phage T7 or phage ϕ12, or sak4 of phage PA73, each preceded by its RBS, between the EcoRI and NcoI sites of pKD46. EcoRI and NcoI sites were artificially introduced at the 5′ and 3′ extremities of the recombinase genes, respectively, by PCR. Site-directed mutagenesis was performed by PCR, designing two 40-bp long complementary oligonucleotides centred on the mutation to be added to the gene. The left and right arms of the gene were then amplified separately, and both fragments were combined in a third PCR to amplify the whole gene. All constructions were verified by sequencing. Plasmids, strains and bacteriophages used for the assay are reported in Supplementary Table S2.

Electro-competent cells were prepared essentially as described by the manufacturer (Biorad), except that cells were grown at 30°C in SLB (5 g/l NaCl, 20 g/l yeast extract, 35 g/l bacto-tryptone and 1 ml/l NaOH 5N). Ninety minutes prior to harvesting, the recombinase was induced by addition of 0.2% arabinose. After 1 h expression of the electroporated cells in SOC medium supplemented with 0.2% glucose, the cultures were further incubated overnight without agitation at room temperature, before plating on LB supplemented with 50 µg/ml rifampicin, and incubating at 37°C. All competent cells had similar levels of competence of around 2 (±1) × 10⁷ transformants per microgram, as estimated with the pACYC184 plasmid.

The 51-nt long oligonucleotide Maj32 is centred on nucleotide 1576, counting from the ATG of the rpoB gene of E. coli MG1655, and corresponds to the non-coding strand. In this oligonucleotide, a G is changed to a C relative to the wild-type sequence at position 1576, so that the CAC codon on the coding strand becomes GAC, and the RpoB protein carries an H526D mutation, which confers resistance to rifampicin (42). Maj32 was ordered from Eurogentec, desalted and purified, and resuspended at a concentration of 1 mg/ml. Increasing amounts of the oligonucleotide mixture were tested to determine the saturating concentration above which no more transformants could be obtained with the Redβ positive control. Ten micrograms of Maj32 were found to be saturating.
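The arithmetic behind the oligonucleotide design can be checked with a short Python sketch. It only uses facts stated in the paragraph (position 1576, the CAC codon, the G to C change on the non-coding strand); the helper names are hypothetical, and the codon table is deliberately restricted to the two codons involved.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
CODON_TO_AA = {"CAC": "H", "GAC": "D"}  # only the two codons needed here


def codon_of(nt_position: int) -> tuple:
    """Return (codon number, 0-based offset within the codon) for a
    1-based nucleotide position counted from the A of the ATG."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3


def mutate_via_noncoding(codon: str, offset: int, new_noncoding_base: str) -> str:
    """Apply a base change written on the non-coding strand to the
    corresponding codon on the coding strand."""
    new_coding_base = COMPLEMENT[new_noncoding_base]
    return codon[:offset] + new_coding_base + codon[offset + 1:]


codon_number, offset = codon_of(1576)
wild_type = "CAC"
# The non-coding strand carries a G opposite the first C of the codon;
# the oligonucleotide replaces that G by a C.
assert COMPLEMENT[wild_type[offset]] == "G"
mutant = mutate_via_noncoding(wild_type, offset, "C")
label = f"{CODON_TO_AA[wild_type]}{codon_number}{CODON_TO_AA[mutant]}"
print(label)  # prints "H526D"
```

Position 1576 is the first base of codon 526, so a G to C change on the non-coding strand turns CAC (His) into GAC (Asp) on the coding strand, which matches the H526D substitution named in the text.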

RESULTS

Remote homology detection of recombinases among phage genomes

Three distinct families of recombinases, represented by the Redβ, Erf and Sak proteins, can be distinguished in bacteriophages, according to the classification scheme of Koonin’s group (24), and a fourth one, Sak4, much less well characterized, has emerged from recent experimental work (33). First, the presence of any of these recombinases among sequenced bacteriophages was assessed based on the 465 genomes archived in the ACLAME database (12,13). Profile–profile comparisons relying on algorithms such as HHsearch (35) are among the most powerful methods to search for homologs among rapidly diverging sequences. Here, the efficiency of the approach was further enhanced by implementing an iterative and systematic profile–profile strategy. Every gene of every bacteriophage was used as a query to build a multiple sequence alignment so as to create a database of 28 300 profiles. This large ensemble of profiles was screened against each of the four Redβ, Erf, Sak and Sak4 recombinase profiles using the HHsearch algorithm (35). This global analysis showed that 126 phages harboured a remote homolog of one of the recombinase superfamilies (Figure 1, thick coloured edges). Among them, 2 Redβ-like, 5 Erf-like, 21 Sak-like and 40 Sak4-like recombinases were found that had not been identified in the ACLAME database as recombinases (circles with thick outline in Figure 1). Using each of the 126 profiles as queries, a second iteration of profile–profile comparisons against the 28 300-profile database was sufficient to reach convergence. Seven additional bacteriophages with a Sak or Redβ recombinase were retrieved. Unexpectedly, this second iteration also revealed that remote homologs of one recombinase family could match remote homologs of another family (shown as black connections between crowns in Figure 1).

As reported in Figure 1, three distinct recombinase families, namely Redβ, Erf and Sak, were connected through remote homology relationships, whereas the Sak4 group of homologs remained isolated. These relationships strongly support the notion that the three recombinase families in fact belong to a single superfamily. This result is all the more surprising given that none of the direct profile–profile comparisons between the initial Redβ, Erf and Sak profiles revealed any significant remote homology signal. Hence, detection of remote homologies between drastically diverged sequences can be considerably enhanced through iterative all-against-all profile–profile comparisons and a search for transitive relationships. All retrieved recombinases are listed in Supplementary Table S3.
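The idea of reaching distant families through transitive relationships can be illustrated with a small Python sketch. It is a generic breadth-first expansion over a table of significant hits, not the authors' implementation: the `toy_hits` table and its intermediate profile names (`s1`, `e1`, `r1`, `k1`) are invented for illustration, and each detected profile is simply reused as a query until no new profile appears.

```python
from collections import deque


def expand(seed, hits):
    """Iterative search from one seed profile: every newly detected
    profile is used as a query in turn, until convergence.
    `hits` maps a profile name to the set of profiles it hits
    significantly (hypothetical, precomputed table)."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        query = queue.popleft()
        for target in sorted(hits.get(query, ())):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Toy hit table (invented): no seed hits another seed directly, but the
# remote homologs s1, e1 and r1 bridge the Sak, Erf and Redbeta families.
toy_hits = {
    "Sak": {"s1"}, "s1": {"Sak", "e1"},
    "Erf": {"e1"}, "e1": {"Erf", "s1", "r1"},
    "Redbeta": {"r1"}, "r1": {"Redbeta", "e1"},
    "Sak4": {"k1"}, "k1": {"Sak4"},
}

rad52_like = expand("Sak", toy_hits)
sak4_like = expand("Sak4", toy_hits)
```

In this toy table the direct comparison of the seeds finds nothing (`"Erf"` is not in `toy_hits["Sak"]`), yet `rad52_like` ends up containing Sak, Erf and Redbeta, while `sak4_like` stays a separate group, mirroring the pattern described for Figure 1.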

Figure 1. Homology relationships between recombinases revealed by iterative profile–profile comparisons. Network of homology relationships detected between the four recombinase superfamilies Redβ/RecT (magenta), Erf (orange), Sak (red) and Sak4 (purple). Nodes represent the bacteriophage genomes for which a homologous recombinase was detected. Edges connect recombinases that were found to be homologous in the profile–profile comparison using HHsearch. Bold coloured edges correspond to significant hits obtained from the initial screening of the bacteriophage 28 300-profile database with the four recombinase profiles. A second iteration using the detected recombinase profiles as queries revealed additional relationships within the same superfamily (light grey edges) and, most importantly, between different superfamilies (black edges). Nodes with a black outline pinpoint novel recombinases so far annotated as of unknown function. Graph created using the Osprey program (53).


The structures for three families of recombinases are predicted to correspond to mini-Rad52

The remote homology between Sak, Erf and Redβ proteins implies that they adopt related folds, and a structural model would provide critical insights into their common properties. Fortunately, a homology between the Sak family and human Rad52 was initially detected through sequence alignments (24) and further assessed recently through several experimental characterizations (26). The structure of the N-terminal domain of human Rad52 was solved and the domain was found to oligomerize into a ring-like shape bearing 11 subunits (43,44). In solution, full-length Rad52 forms homo-oligomers of seven or more subunits (45). Similarly, electron microscopy visualization of Sak proteins revealed that they assemble into 11-mer stacked ring structures (26). A structural model of the monomeric Sak protein and of its oligomeric form could be generated using the structure of human Rad52 as a template. With respect to Redβ and Erf, a structural model could also be generated based on the homology assessed above. Structural assessment of the models guided the optimization of the alignment (see ‘Materials and Methods’ section) (Figure 2A), and the model obtained for Redβ is compared to the Rad52 N-terminus structure in Figure 2B as a ribbon representation.

The major characteristic of the three recombinases with respect to the Rad52 core structure is that they match a shortcut Rad52 fold. Significant structural deletions are found in the main helix (α3) and the three strands β3–β4–β5 wrapping around α3 (dashed regions in the secondary structure cartoon in Figure 2A). Such a deletion in the fold probably accounts for the difficulties encountered so far in detecting homologies within the Rad52-like superfamily. Interestingly, although the fold has been significantly rearranged through evolution, some specific features remained conserved in all these recombinases. Figure 2C maps the sequence conservation index at the surface of the recombinase models as calculated by the Rate4Site algorithm (41). The clusters of red/orange positions in the groove testify to the specific conservation and functional importance of this region. In particular, one position, indicated by a red star in Figure 2A–C, is strictly maintained as a positively charged residue in all three recombinases. Mutation of the corresponding amino acid, K152, in human Rad52 was found to fully abrogate single-strand DNA binding (43). We mutated the corresponding position in Redβ (R161) and reached a similar conclusion (see ‘Experimental validation’ section). Hence, despite their sequence divergence, a remarkable selection pressure is apparent in the groove, strongly suggesting that bacteriophage recombinases function similarly to Rad52 in binding single-stranded DNA and promoting strand annealing.

Figure 2. Structural models of the recombinases found to be evolutionarily related to the Rad52 superfamily. (A) Optimal sequence alignment resulting from the structural modelling procedure between the sequences of human Rad52 corresponding to the PDB structure (1H2I), Sak (from phage ul36), Erf (D3) and Redβ (933W). Secondary structures are indicated on top of the alignment, coloured from blue to red as in (B and C). Secondary structures truncated in the phage recombinases relative to Rad52 are shown dashed. The red star indicates the position of K152 in Rad52, whose mutation abrogated single-strand binding. (B) Cartoon representation of the Rad52 crystallized domain and of the Redβ structural model, with the red star pinpointing the side chain of K152 and R161, respectively. (C) Surface representation of the Rad52 11-mer oligomeric form as observed in the crystal structure 1H2I, together with the 11-mer models of the three recombinases Sak, Erf and Redβ. On the left, the surface is coloured in either light cyan or orange to help visualize the external and internal ring, respectively. On the right, the surface is coloured with respect to the conservation index calculated using the Rate4Site program, conservation grade increasing from white to red.

Identification of Sak4 as a mini-Rad51

Sak4, the fourth recombinase, remained isolated in Figure 1 and could not be related to any feature of the Rad52 fold. In contrast, comparison of the Sak4 profile against a database of PDB-derived profiles revealed an unambiguous homology to the RecA-RadA-Rad51 family, with the highest probability for the structure of the archaeal RadB paralog. Sak4 shares only 15% identity with RecA and Rad51, but bears both the Walker A and Walker B motifs involved in ATP binding, as shown in the alignment of Rad51 paralogs in Figure 3A. A striking feature of Sak4 is its small size: with 245 residues in bacteriophage φ31, it represents only two-thirds of the RecA or Rad51 proteins.
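As a rough illustration of the Walker A check mentioned above, the sketch below scans a protein sequence for the widely used P-loop consensus [AG]-x(4)-G-K-[ST]. The regular expression and the helper name `find_walker_a` are assumptions of this example, not part of the authors' pipeline, which relied on profile–profile comparison. The two test fragments are read off the extracted alignment of Figure 3A and may contain transcription errors.

```python
import re
from typing import List, Tuple

# Common P-loop (Walker A) consensus: [AG]-x(4)-G-K-[ST].
# This pattern is an assumption of the sketch, not the authors' criterion.
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_walker_a(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched motif) for each Walker A-like hit."""
    # Drop alignment gaps and spaces before scanning.
    seq = sequence.replace("-", "").replace(" ", "").upper()
    return [(m.start(), m.group()) for m in WALKER_A.finditer(seq)]


# Fragments transcribed from the Figure 3A alignment (gaps removed).
fragments = {
    "Sak4 (phi31)": "APPGTGKTSTIKYLPGR",
    "RecA (E. coli)": "GPESSGKTTLTLQVIAAAQRE",
}

for name, frag in fragments.items():
    print(name, find_walker_a(frag))
```

A simple pattern match of this kind only confirms that a motif is present; it cannot by itself establish remote homology, which is why the profile-based comparison remains the primary evidence.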

Our analysis brought to light 40 bacteriophages harbouring a Sak4 homolog, all of which share this property of reduced size (Supplementary Table S3). How can the RecA or Rad51 fold tolerate such a drastic size reduction? First, only the region of the ATPase domain matched the Sak4 profile, leaving no room for a secondary DNA-binding domain such as those found in the Rad51 N-terminus and the RecA C-terminus (Figure 3). Furthermore, specific deletions of helical turns need to be considered to permit a correct alignment (helices α1, α3 and α4). A structural model accounting for these deletions could be reliably generated, illustrating the architecture of Sak4 as a mini-RecA protein (Figure 3B). The absence of the secondary DNA-binding domain is also found in other Rad51 paralogs such as RadB in Archaea, whose specific function with respect to RadA remains unclear (46), or XRCC2, which is probably a Rad51 cofactor (47). In the RecA protein, the secondary DNA-binding domain is proposed to facilitate strand exchange by keeping the displaced strand away from the heteroduplex (48). It may be, therefore, that the Sak4

Figure 3. Structural model of Sak4, a shortcut homolog of the RecA/Rad51 superfamily. (A) Comparison between Sak4 from Lactococcus phage φ31 (gb:AAC48871) and other members of the RecA/RadA/Rad51 superfamily: E. coli RecA (B7N6S9), Pyrococcus abyssi RadA (Q9V233) and RadB (Q9V2F6), human Rad51 (Q06609), Rad51B (O15315), Rad51C (O43502), Rad51D (O75771), XRCC2 (O43543) and XRCC3 (O43542). Secondary structures are indicated on top of the alignment, coloured from blue to red; pink and purple bars indicate the position of additional domains (the N-terminal DNA-binding domain of Rad51/RadA and the C-terminal DNA-binding domain of RecA); the Walker A and Walker B motifs are labelled. (B) Ribbon representation of the structural model of Sak4 from Lactococcus phage φ31 compared with the X-ray structures of RadB (pdb 2cvh), RecA (pdb 1u94) and Rad51 (pdb 1szp).


proteins have lost (or have never acquired) the capacity to promote strand exchange, and maintain only a single-strand annealing activity.

Alternative recombinase systems revealed by combining profile–profile and gene-context analyses

The architecture of bacteriophage genomes often exhibits a conserved arrangement of gene functions. At a first level, genomes can typically be segmented into modules comprising either head or tail assembly factors, the DNA replication machinery or cell lysis factors. Analysis of the gene neighbours of the 133 recombinases detected either as mini-Rad52 (Redβ, Erf, Sak) or as mini-Rad51 (Sak4) confirmed that the recombinase was located within a replication module (Supplementary Figure S4, Tables S4 and S5). Furthermore, we found examples of substitutions between the different recombinase families, as illustrated in Figure 4A and Supplementary Figure S5. Can the pattern of genes surrounding these recombinases be used further to identify alternative recombinase systems in other phage genomes?

A key step in the gene-context analysis remains the identification of homologous proteins in the vicinity of the recombinase genes. Here again, high sequence divergence among the neighbouring sequences may limit the analysis. We increased the sensitivity of homology detection among the neighbours of the recombinases by applying the profile–profile comparison protocol described previously, focusing on the genes in the DNA replication module. In particular, we looked for orthologs of RecE, a 5′–3′ exonuclease which prepares the single-strand substrate on which the recombinase performs strand annealing. This protocol revealed 84 genomes containing a recE-like homolog, whereas only 26 could be detected through PSI-Blast analyses (yellow and red striped boxes in Figure 4A). However, among these 84 genomes, only 30 contained both a RecE-like exonuclease and a recombinase (Redβ, Erf or Sak4) (shown as pink or dark pink in Figure 4A), suggesting that some other recombinase family may be hidden in the vicinity of the recE-like gene in the remaining genomes.

Indeed, in 19 genomes, a specific set of homologous but unknown proteins was detected, each of them clustering with a recE-like gene. One example of such a protein, found in bacteriophage φ12, is indicated by a white box outlined in black in Figure 4A. Comparing the profiles of these 19 proteins against a database of PDB-derived profiles revealed an unambiguous homology to the gene 2.5 protein (Gp2.5) of bacteriophage T7. Gp2.5 possesses a strand-annealing activity which contributes to genetic recombination during growth of phage T7 (22,23). The structure of Gp2.5 from phage T7 showed that it contains an OB-fold (21) but differs from other SSBs by the presence of the long α1 helix represented in Figure 4B. The R82C mutation (indicated by a red star), located just downstream of this helix, specifically abrogated the single-strand annealing activity, suggesting that this extra helix is important for this property (49). In the Gp2.5-like protein of phage φ12, the long helix is also predicted and a conserved basic residue is found at the position corresponding to R82 in Gp2.5 of phage T7. Hence, not only the existence of the helix but also the conservation of important positions suggest that protein Phi12P11 (NP_803317) of phage φ12 belongs to the specific Gp2.5 superfamily rather than to the larger SSB family.
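The gene-context reasoning above can be sketched in a few lines. The example below is only an illustration under stated assumptions: genomes are represented as ordered lists of family labels (the labels, the window size of three genes and the function name `candidate_neighbours` are all hypothetical), and the toy gene orders are invented, not taken from Figure 4A. It flags unannotated genes that sit next to a recE-like gene in genomes lacking any recognised recombinase, and recomputes from the counts quoted in the text how many recE-containing genomes are left without a recombinase.

```python
from typing import Dict, List

# Hypothetical family labels for recombinases already recognised.
KNOWN_RECOMBINASES = {"Redb", "Erf", "Sak", "Sak4"}


def candidate_neighbours(genes: List[str], window: int = 3) -> List[int]:
    """Indices of unannotated genes within `window` genes of a recE-like
    gene, reported only when the genome has no known recombinase."""
    if KNOWN_RECOMBINASES & set(genes):
        return []
    hits = set()
    for i, family in enumerate(genes):
        if family != "recE":
            continue
        lo, hi = max(0, i - window), min(len(genes), i + window + 1)
        hits.update(j for j in range(lo, hi) if genes[j] == "unknown")
    return sorted(hits)


# Toy gene orders (invented for illustration only).
genomes: Dict[str, List[str]] = {
    "toy_phage_A": ["int", "repr", "Sak4", "recE", "ssb", "unknown"],
    "toy_phage_B": ["int", "repr", "unknown", "recE", "ssb", "hel", "unknown"],
}

for name, genes in genomes.items():
    print(name, candidate_neighbours(genes))

# Counts quoted in the text: 84 genomes with a recE-like gene,
# 30 of which also carry a recognised recombinase.
without_recombinase = 84 - 30
print(without_recombinase)
```

In the study itself the candidate neighbours were then compared by profile–profile searches; the window-based filter here only stands in for the first, positional step.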

Figure 4. Analysis of the gene context in the DNA replication and recombination module of several bacteriophages (see also Supplementary Figure S5). (A) Phage names are indicated on top, with the type of detected recombinase annotated below. ORFs are represented by ordered boxes whose size is related to the ORF length. Light and medium grey boxes indicate genes not considered in the analysis. The different predicted recombinases are coloured pink. Other colours, when identical, pinpoint genes sharing a homology relationship (either remote or close). The genes shown are those surrounding the homolog of the 5′–3′ exonuclease RecE (yellow/red stripes) in phages P335 sensu lato (in fact, phage 4268), bIL309, TLS and φ12. A Rad52-like (pink box) or Rad51-like (dark pink box) recombinase is often found in the close neighbourhood of RecE, suggesting that the φ12 gene drawn in white and labelled with a question mark could itself be a recombinase. Some putative functions are labelled with short names: ini, initiator; hel loader, DnaC-type helicase loader; ssb, single-strand binding; RecE, 5′–3′ exonuclease RecE; int, phage integrase; repr, repressor; DNA pol, DNA polymerase (Pol I-type); dut, dUTPase; endo, endonuclease; SF2 hel, superfamily 2 helicase; PriA, DNA primase. (B) Structural model of the Gp2.5-like protein identified in phage φ12. From left to right, ribbon representation of (i) the structure of the Gp2.5 protein of phage T7 (PDB code: 1je5), (ii) the structural model of the distant Gp2.5 homolog from phage φ12 and (iii) the structure of the single-strand binding (SSB) protein of E. coli (PDB code: 1eyg). Red stars in the Gp2.5 structures pinpoint the residues, labelled R82 and K79, whose mutation abrogated single-strand annealing activity [in (49) and in this study].


As with Rad51- or Rad52-like recombinases, remote homologs of Gp2.5 are not always associated with RecE, and the profile–profile comparison revealed three additional Gp2.5 homologs. All in all, 30 phages were found to contain a Gp2.5-like recombinase (Supplementary Table S3), underscoring the importance of this class of recombinase alongside the Rad52- and Rad51-like proteins.

Predicted Sak4 and Gp2.5 recombinases have single-strand annealing activities in vivo

It has been reported that various phage recombinases of the Rad52-like superfamily are able to promote single-strand annealing in vivo in E. coli (50). The assay is based on a 70-nt oligonucleotide complementary to the gal gene that reverts cells to a Gal+ phenotype. Depending on the recombinase tested, yields of recombinants varied by a factor of 3000, with Redβ giving the highest ratio of recombinants (2 × 10⁻¹ Gal+ per viable cell) and GP35 of a Bacillus subtilis prophage the lowest (6.5 × 10⁻⁵) (50). We used a similar assay to determine whether the predicted recombinases of the Sak4 and Gp2.5 families were able to recombine DNA in vivo. To do so, we used a 51-nt single-strand oligonucleotide complementary to the rpoB gene of E. coli, except for a single point mutation that confers resistance to rifampicin. We then tested whether the presence of one of the predicted recombinases allowed integration of the oligonucleotide into the E. coli chromosome upon transformation by electroporation. The E. coli recipient strain was chosen to be defective for recA, as RecA is known not to be needed for such a reaction, and proficient for mismatch repair, to minimize the background level of spontaneous rifampicin-resistant (RifR) clones. The rpoB mutation created a C:C mismatch, which escapes the MutLSHU repair system. We verified that the five recombinases are produced in similar amounts in vivo, except for Gp2.5 from T7, which is produced at 5–10 times the level of the other four (Supplementary Figure S6).
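To make the oligonucleotide design concrete, the sketch below builds a 51-nt oligonucleotide that is complementary to a target strand except at one central position, where it carries a C opposite a C in the target, so that annealing produces a C:C mismatch. The target sequence is an arbitrary placeholder, not the real rpoB sequence or the oligonucleotide used in this study, and the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatch_oligo(target: str, position: int, oligo_base: str) -> str:
    """Oligo complementary to `target` except opposite target[position],
    where it carries `oligo_base` instead of the Watson-Crick partner."""
    if COMPLEMENT[target[position]] == oligo_base:
        raise ValueError("oligo_base would pair normally; no mismatch created")
    oligo = list(reverse_complement(target))
    # target[i] pairs with index len(target) - 1 - i of the reverse complement
    oligo[len(target) - 1 - position] = oligo_base
    return "".join(oligo)


# Placeholder 51-nt target with a C at the central position (index 25).
# This is NOT the rpoB sequence; it only illustrates the geometry.
target = "ATGCGTACGTTAGCATCGGATACGT" + "C" + "GGATCCATGCAAGTTCGATCGTAGC"
oligo = mismatch_oligo(target, position=25, oligo_base="C")

# Positions where the oligo departs from the perfect complement.
perfect = reverse_complement(target)
mismatches = [i for i, (a, b) in enumerate(zip(oligo, perfect)) if a != b]
print(len(oligo), mismatches)
```

Placing the mismatch at the centre of the oligonucleotide is a choice made for this example; the text states only that the oligonucleotide differs from rpoB by a single point mutation.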

With the Redβ recombinase (of the Rad52 superfamily), a yield of 2 × 10⁻³ RifR per viable cell was obtained at saturating amounts of oligonucleotide (Figure 5; see Supplementary Table S7 for exact numbers). Next, UvsX, the full-size RecA-like enzyme of phage T4, and the predicted minimal Rad51-like protein of the Sak4 family from phage PA73 (infecting Pseudomonas aeruginosa) were compared. Both recombinases produced 1–2 × 10⁻⁶ RifR transformants per cell, 2000-fold less than Redβ. Nonetheless, even this lower frequency was significantly higher than the background level of RifR recombinants produced with the control plasmid devoid of recombinase, which was on average 4.9 × 10⁻⁷ per viable cell. Because the Gp2.5 family has various distant homologs, two members of the family were tested, those of phages T7 and φ12 (a prophage of Staphylococcus aureus). A yield of 10⁻⁵ RifR clones per viable cell was obtained with the T7 protein, and 10-fold less with the φ12 protein. The higher yield observed with the T7 protein compared to the φ12 protein may be due to its higher expression level. We conclude therefore that the predicted recombinases of phages PA73 and φ12 promote single-strand annealing in vivo, albeit with a reduced efficiency with respect to Redβ. The reduced efficiency may be related to the heterologous expression of these proteins in E. coli, and more complete studies are required to characterize these new recombinases fully.

Alignments of recombinases of the Rad52 family highlighted a conserved arginine/lysine residue across all sub-families (Figure 2, red star). A mutation converting this arginine into a cysteine (R161C) in the Redβ gene led to a 1000-fold reduction of in vivo annealing efficiency (Figure 5 and Supplementary Table S7), confirming the critical role of this positively charged residue in vivo. The structure–function analysis of Gp2.5 revealed that the R82C substitution specifically affects the single-strand annealing activity of the protein in vitro (49). A substitution affecting the lysine residue located at the equivalent position in the gp2.5 gene of phage φ12 (K79C) was therefore tested, as well as one at a neighbouring, more conserved position, R83C (Figure 4B and Supplementary Figure S3). Both protein variants produced a yield of recombinants indistinguishable from the background value. This confirmed the importance of both residues in the φ12 protein, as well as the significance of the low activity detected for the wild-type φ12 protein. Mutants in the Walker A motif of the ATPase of the UvsX and Sak4 proteins were constructed and, interestingly, the recombination activity of the Sak4 mutant was not affected at all, in contrast to that of the UvsX mutant (Figure 5 and Supplementary Table S7). The property of Sak4 is

Figure 5. Single-strand annealing activities in vivo of the three superfamilies of recombinases. The efficiency of recombination is estimated by the ratio of rifampicin-resistant recombinants (generated by integration of the Maj32 single-strand oligonucleotide into rpoB) per viable cell (y-axis, ‘Recombinants RifR/viable’, log scale from 10⁻⁸ to 10⁻²). Each value corresponds to the average of at least three experiments, performed in an E. coli AB1157 derivative in which a given recombinase (x-axis labels: Redβ, Redβ R161C, Gp2.5 T7, Gp2.5 φ12, Gp2.5 φ12 K79C, Gp2.5 φ12 R83C, UvsX, Sak4, UvsX K66A, Sak4 K30A, No recombinase) has been induced from a low-copy-number pSC101 plasmid derivative. Spontaneous rifampicin-resistant mutants were obtained at a median frequency of 2 × 10⁻⁸. To ascertain the effect of each recombinase compared with the empty plasmid vector (‘No recombinase’), a Student's t-test was performed: double asterisks indicate significance at the 1% level, and a single asterisk indicates significance at the 5% level.


reminiscent of the behaviour reported for XRCC2, a short-version paralog of Rad51 which regulates the length of gene conversion tracts in vertebrate cells, irrespective of its ATPase activity (51).
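The fold differences quoted for the rifampicin assay follow directly from the reported yields. The short calculation below recomputes them; the values are those stated in the text (rounded figures, with 1 × 10⁻⁶ taken as the lower bound of the 1–2 × 10⁻⁶ range for UvsX and Sak4), and the dictionary layout is simply a convenience of this sketch. Exact numbers are in Supplementary Table S7.

```python
# Approximate yields of RifR recombinants per viable cell, as quoted in the text.
yields = {
    "Redb": 2e-3,
    "UvsX / Sak4 (lower bound)": 1e-6,
    "Gp2.5 T7": 1e-5,
    "Gp2.5 phi12": 1e-6,   # "10-fold less" than the T7 protein
}
background = 4.9e-7        # control plasmid without recombinase


def fold(a: float, b: float) -> float:
    """Ratio a / b, rounded to one decimal for display."""
    return round(a / b, 1)


for name, value in yields.items():
    print(f"{name}: {fold(value, background)}-fold over background, "
          f"{fold(yields['Redb'], value)}-fold below Redb")
```

With the lower-bound value, the ratio between Redβ and the UvsX or Sak4 yield is the 2000-fold difference given in the text, while the same yield is only about twice the average background.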

DISCUSSION

Three major superfamilies of phage recombinases

Our systematic analysis of 465 bacteriophages, combining sensitivity-enhanced homology searches, structural modelling and gene-context analyses, provided several insights into the nature of the homologous recombination systems in bacteriophages (Supplementary Table S3 and Figure S7). First, we were able to draw a unified picture for most recombinases experimentally studied so far. We established with high confidence that the Rad52 fold has been substantially rearranged and shortcut throughout bacteriophage evolution, giving rise to a wide variety of recombinases that were previously thought to belong to different families (24). This feature has also been highlighted in two recent works for Sak (26) and Redβ (32). In our work, careful structural modelling led to optimized alignments. As a result, our Redβ/Rad52 alignment (Figure 2A) predicts R161 as a functional hot spot in Redβ, whereas this residue was not aligned with conserved positions in (32).

Strongly supported by gene-context analyses, we further discovered two unsuspected recombinase systems evolutionarily related to either the Rad51 or the Gp2.5 superfamily. Most members of the Rad51-like family were orthologs of Sak4, which is again a shortcut version of Rad51, and substituted extensively for Rad52 homologs in similar gene contexts. Experimentally, we validated that newly identified members of the Rad51-like and Gp2.5-like superfamilies do act as recombinases in vivo. Point mutations abrogating Redβ (R161C) and φ12 Gp2.5 (K79C, R83C) activity were successfully designed and validate the alignments optimized with the help of the structural templates. The vast majority of genomes have at most one recombinase gene. This almost exclusive presence likely reflects the similar tasks these recombinases perform in all these bacteriophages (Supplementary Table S3). The four exceptions (phages CJW1 and 244 from Mycobacterium smegmatis, 0305phi8-36 from Bacillus cereus, and YS40 from Thermus thermophilus) include cases where a Rad51-like gene is combined with a Rad52-like gene, a situation reminiscent of that found in eukaryotes.

Recombinase distribution correlates with genome size

Are these recombinases found in certain classes of bacteriophages or are they randomly distributed? As regards genome size, all genomes of <20 kb (140 genomes), including single-strand DNA genomes and RNA genomes, were devoid of recombinase. In the 20–80 kb range, 160 genomes out of 286 were identified with a recombinase from either the Rad52-like, Sak4-like or Gp2.5-like superfamily. Rad52-like and Sak4-like recombinases are almost exclusively found in temperate phages (P-values of 5.5 × 10⁻⁵ and 1.2 × 10⁻⁴, respectively), whereas Gp2.5-like recombinases are spread among both temperate and virulent phages (Supplementary Figure S8B and Table S3). Among the 39 genomes of >80 kb, 24 have a UvsX/RecA-like recombinase, whereas only one has a Rad52-like one. In contrast to Sak4, UvsX/RecA recombinases are significantly associated with virulent phages (P-value of 2 × 10⁻⁶). The presence of a given recombinase is therefore strongly correlated with the virulent or temperate character of the phage (Supplementary Figure S8B and Table S3). Overall, 57% of the genomes >20 kb now have an identified recombinase.
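The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; it assumes that the 24 UvsX/RecA-like genomes and the single Rad52-like genome among the >80 kb genomes are distinct, which the text does not state explicitly.

```python
# Genome counts quoted in the text, grouped by genome size.
size_classes = {
    "<20 kb":   {"genomes": 140, "with_recombinase": 0},
    "20-80 kb": {"genomes": 286, "with_recombinase": 160},
    # Assumption: the 24 UvsX/RecA-like genomes and the one Rad52-like
    # genome do not overlap, giving 25 genomes with a recombinase.
    ">80 kb":   {"genomes": 39,  "with_recombinase": 24 + 1},
}

total_genomes = sum(c["genomes"] for c in size_classes.values())

# Restrict to genomes larger than 20 kb.
large = [c for name, c in size_classes.items() if name != "<20 kb"]
large_total = sum(c["genomes"] for c in large)
large_with = sum(c["with_recombinase"] for c in large)
fraction = large_with / large_total

print(total_genomes)            # all genomes analysed
print(large_with, large_total)  # >20 kb genomes with a recombinase, all >20 kb
print(f"{100 * fraction:.0f}%")
```

With these inputs the three size classes sum to the 465 genomes analysed, and 185 of the 325 genomes >20 kb carry a recombinase, which rounds to the 57% quoted in the text.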

Whether large genomes with no detected recombinase (Supplementary Table S6) are really devoid of recombinase function or encode an undetected one remains an open question. Phage genomes with terminal repeats, if they truly lack a recombinase gene, are not expected to grow in a recA host. One way to screen for putative new families of recombinases in the future will be to test whether orphan phages with terminal repeats, especially those encoding the recE-like accessory gene, grow in recA hosts.

Temperate phages and mosaicism

As mentioned in the ‘Introduction’ section, a major difference between UvsX/RecA and a recombinase such as Redβ lies in the fidelity and the stringency of the recombination process. Redβ was shown to be 100-fold more efficient than RecA for homeologous recombination between 22% divergent sequences (16). We proposed earlier that a manifestation of this reduced stringency in Redβ-containing phages may be the existence of genome mosaicism, characterized by the presence of dispersed and almost identical segments recently exchanged between divergent phages infecting the same host (16). In contrast, comparative genomics of virulent T4-like genomes (which contain UvsX) gives no evidence for such mosaicism (52). The fidelity of Sak4-like recombinases is not known at present, but our previous study on mosaicism reveals that the nine Staphylococcus aureus phages (16) encompassing a Sak4-like recombinase (namely phages 187, 69, 77, 96, ROSA, 71, 55, 29 and 52A) have a high density of mosaics (an average of 10 per comparison), compared to the genomes encompassing a Rad52-like protein (namely phages 53, 85, 42e, 37, EW, 88, 92 and X2; 6.4 mosaics per comparison on average). Accordingly, Sak4 recombinases might behave similarly to Redβ towards homeologous recombination. This is unexpected, given the high fidelity of RecA, which belongs to the same superfamily as Sak4.

Hence, two members of the Rad51 superfamily, Sak4 and UvsX/RecA, may be associated with very distinct genomic structures related to the fidelity of the recombination process and the existence of mosaicism. Comparing the capacity of Sak4 and UvsX proteins to promote homeologous recombination in vivo will tell whether these proteins effectively behave differently, and which domains or sequence motifs contribute to the distinct features. Along the same lines, it will be interesting to further probe whether the various Gp2.5 homologs, quite evenly distributed among virulent and temperate phages, also play distinct roles with respect to mosaicism.


SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

We are grateful to Raphael Leplae for providing phage classification data and to Johannes Soding for providing the latest version of the HHsearch algorithm. We thank Colin Tinsley and Xavier Veaute for critical reading of the article and the two anonymous referees for their helpful comments and suggestions.

Conflict of interest statement. None declared.

REFERENCES

1. Desnues,C., Rodriguez-Brito,B., Rayhawk,S., Kelley,S., Tran,T.,Haynes,M., Liu,H., Furlan,M., Wegley,L., Chau,B. et al. (2008)Biodiversity and biogeography of phages in modern stromatolitesand thrombolites. Nature, 452, 340–343.

2. Dinsdale,E.A., Edwards,R.A., Hall,D., Angly,F., Breitbart,M.,Brulc,J.M., Furlan,M., Desnues,C., Haynes,M., Li,L. et al. (2008)Functional metagenomic profiling of nine biomes. Nature, 452,629–632.

3. Fierer,N., Breitbart,M., Nulton,J., Salamon,P., Lozupone,C.,Jones,R., Robeson,M., Edwards,R.A., Felts,B., Rayhawk,S. et al.(2007) Metagenomic and small-subunit rRNA analyses reveal thegenetic diversity of bacteria, archaea, fungi, and viruses in soil.Appl. Environ. Microbiol., 73, 7059–7066.

4. Schoenfeld,T., Patterson,M., Richardson,P.M., Wommack,K.E.,Young,M. and Mead,D. (2008) Assembly of viral metagenomesfrom yellowstone hot springs. Appl. Environ. Microbiol., 74,4164–4174.

5. Kwan,T., Liu,J., DuBow,M., Gros,P. and Pelletier,J. (2005) Thecomplete genomes and proteomes of 27 Staphylococcus aureusbacteriophages. Proc. Natl Acad. Sci. USA, 102, 5174–5179.

6. Kwan,T., Liu,J., Dubow,M., Gros,P. and Pelletier,J. (2006)Comparative genomic analysis of 18 Pseudomonas aeruginosabacteriophages. J. Bacteriol., 188, 1184–1187.

7. Pedulla,M.L., Ford,M.E., Houtz,J.M., Karthikeyan,T.,Wadsworth,C., Lewis,J.A., Jacobs-Sera,D., Falbo,J., Gross,J.,Pannunzio,N.R. et al. (2003) Origins of highly mosaicmycobacteriophage genomes. Cell, 113, 171–182.

8. Bossi,L., Fuentes,J.A., Mora,G. and Figueroa-Bossi,N. (2003)Prophage contribution to bacterial population dynamics.J. Bacteriol., 185, 6467–6471.

9. Brown,S.P., Le Chat,L., De Paepe,M. and Taddei,F. (2006)Ecology of microbial invasions: amplification allows virus carriersto invade more rapidly when rare. Curr. Biol., 16, 2048–2052.

10. Lindell,D., Jaffe,J.D., Coleman,M.L., Futschik,M.E.,Axmann,I.M., Rector,T., Kettler,G., Sullivan,M.B., Steen,R.,Hess,W.R. et al. (2007) Genome-wide expression dynamics of amarine virus and host reveal features of co-evolution. Nature,449, 83–86.

11. Pal,C., Macia,M.D., Oliver,A., Schachar,I. and Buckling,A. (2007)Coevolution with viruses drives the evolution of bacterialmutation rates. Nature, 450, 1079–1081.

12. Leplae,R., Hebrant,A., Wodak,S.J. and Toussaint,A. (2004) ACLAME: a CLAssification of Mobile genetic Elements. Nucleic Acids Res., 32, D45–D49.

13. Leplae,R., Lima-Mendez,G. and Toussaint,A. (2010) ACLAME: a CLAssification of Mobile genetic Elements, update 2010. Nucleic Acids Res., 38, D57–D61.

14. Lima-Mendez,G., Toussaint,A. and Leplae,R. (2007) Analysis of the phage sequence space: the benefit of structured information. Virology, 365, 241–249.

15. Drake,J.W., Charlesworth,B., Charlesworth,D. and Crow,J.F. (1998) Rates of spontaneous mutation. Genetics, 148, 1667–1686.

16. Martinsohn,J.T., Radman,M. and Petit,M.A. (2008) The lambda red proteins promote efficient recombination between diverged sequences: implications for bacteriophage genome mosaicism. PLoS Genet., 4, e1000065.

17. Hatfull,G.F. (2008) Bacteriophage genomics. Curr. Opin. Microbiol., 11, 447–453.

18. Stahl,M.M., Thomason,L., Poteete,A.R., Tarkowski,T., Kuzminov,A. and Stahl,F.W. (1997) Annealing vs. invasion in phage lambda recombination. Genetics, 147, 961–977.

19. Fenton,A.C. and Poteete,A.R. (1984) Genetic analysis of the erf region of the bacteriophage P22 chromosome. Virology, 134, 148–160.

20. Mosig,G., Gewin,J., Luder,A., Colowick,N. and Vo,D. (2001) Two recombination-dependent DNA replication pathways of bacteriophage T4, and their roles in mutagenesis and horizontal gene transfer. Proc. Natl Acad. Sci. USA, 98, 8306–8311.

21. Hollis,T., Stattel,J.M., Walther,D.S., Richardson,C.C. and Ellenberger,T. (2001) Structure of the gene 2.5 protein, a single-stranded DNA binding protein encoded by bacteriophage T7. Proc. Natl Acad. Sci. USA, 98, 9557–9562.

22. Araki,H. and Ogawa,H. (1981) The participation of T7 DNA-binding protein in T7 genetic recombination. Virology, 111, 509–515.

23. Kong,D. and Richardson,C.C. (1996) Single-stranded DNA binding protein and DNA helicase of bacteriophage T7 mediate homologous DNA strand exchange. EMBO J., 15, 2010–2019.

24. Iyer,L.M., Koonin,E.V. and Aravind,L. (2002) Classification and evolutionary history of the single-strand annealing proteins, RecT, Redbeta, ERF and RAD52. BMC Genomics, 3, 8.

25. Passy,S.I., Yu,X., Li,Z., Radding,C.M. and Egelman,E.H. (1999) Rings and filaments of beta protein from bacteriophage lambda suggest a superfamily of recombination proteins. Proc. Natl Acad. Sci. USA, 96, 4279–4284.

26. Ploquin,M., Bransi,A., Paquet,E.R., Stasiak,A.Z., Stasiak,A., Yu,X., Cieslinska,A.M., Egelman,E.H., Moineau,S. and Masson,J.Y. (2008) Functional and structural basis for a bacteriophage homolog of human RAD52. Curr. Biol., 18, 1142–1146.

27. Poteete,A.R., Sauer,R.T. and Hendrix,R.W. (1983) Domain structure and quaternary organization of the bacteriophage P22 Erf protein. J. Mol. Biol., 171, 401–418.

28. Datsenko,K.A. and Wanner,B.L. (2000) One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl Acad. Sci. USA, 97, 6640–6645.

29. Muyrers,J.P., Zhang,Y., Testa,G. and Stewart,A.F. (1999) Rapid modification of bacterial artificial chromosomes by ET-recombination. Nucleic Acids Res., 27, 1555–1557.

30. Yu,D., Ellis,H.M., Lee,E.C., Jenkins,N.A., Copeland,N.G. and Court,D.L. (2000) An efficient recombination system for chromosome engineering in Escherichia coli. Proc. Natl Acad. Sci. USA, 97, 5978–5983.

31. Ellis,H.M., Yu,D., DiTizio,T. and Court,D.L. (2001) High efficiency mutagenesis, repair, and engineering of chromosomal DNA using single-stranded oligonucleotides. Proc. Natl Acad. Sci. USA, 98, 6742–6746.

32. Erler,A., Wegmann,S., Elie-Caille,C., Bradshaw,C.R., Maresca,M., Seidel,R., Habermann,B., Muller,D.J. and Stewart,A.F. (2009) Conformational adaptability of Redbeta during DNA annealing and implications for its structural relationship with Rad52. J. Mol. Biol., 391, 586–598.

33. Bouchard,J.D. and Moineau,S. (2004) Lactococcal phage genes involved in sensitivity to AbiK and their relation to single-strand annealing proteins. J. Bacteriol., 186, 3649–3652.

34. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

35. Soding,J. (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics, 21, 951–960.

36. Sali,A. and Blundell,T.L. (1993) Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol., 234, 779–815.

37. Luthy,R., Bowie,J.U. and Eisenberg,D. (1992) Assessment of protein models with three-dimensional profiles. Nature, 356, 83–85.

38. Sippl,M.J. (1993) Recognition of errors in three-dimensional structures of proteins. Proteins, 17, 355–362.



39. Clamp,M., Cuff,J., Searle,S.M. and Barton,G.J. (2004) The Jalview Java alignment editor. Bioinformatics, 20, 426–427.

40. DeLano,W.L. (2002) The PyMOL Molecular Graphics System. DeLano Scientific, San Carlos, CA, USA.

41. Pupko,T., Bell,R.E., Mayrose,I., Glaser,F. and Ben-Tal,N. (2002) Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics, 18(Suppl 1), S71–S77.

42. Wolff,E., Kim,M., Hu,K., Yang,H. and Miller,J.H. (2004) Polymerases leave fingerprints: analysis of the mutational spectrum in Escherichia coli rpoB to assess the role of polymerase IV in spontaneous mutation. J. Bacteriol., 186, 2900–2905.

43. Kagawa,W., Kurumizaka,H., Ishitani,R., Fukai,S., Nureki,O., Shibata,T. and Yokoyama,S. (2002) Crystal structure of the homologous-pairing domain from the human Rad52 recombinase in the undecameric form. Mol. Cell, 10, 359–371.

44. Singleton,M.R., Wentzell,L.M., Liu,Y., West,S.C. and Wigley,D.B. (2002) Structure of the single-strand annealing domain of human RAD52 protein. Proc. Natl Acad. Sci. USA, 99, 13492–13497.

45. Stasiak,A.Z., Larquet,E., Stasiak,A., Muller,S., Engel,A., Van Dyck,E., West,S.C. and Egelman,E.H. (2000) The human Rad52 protein exists as a heptameric ring. Curr. Biol., 10, 337–340.

46. Akiba,T., Ishii,N., Rashid,N., Morikawa,M., Imanaka,T. and Harata,K. (2005) Structure of RadB recombinase from a hyperthermophilic archaeon, Thermococcus kodakaraensis KOD1: an implication for the formation of a near-7-fold helical assembly. Nucleic Acids Res., 33, 3412–3423.

47. Liu,N., Lamerdin,J.E., Tebbs,R.S., Schild,D., Tucker,J.D., Shen,M.R., Brookman,K.W., Siciliano,M.J., Walter,C.A., Fan,W. et al. (1998) XRCC2 and XRCC3, new human Rad51-family members, promote chromosome stability and protect against DNA cross-links and other damages. Mol. Cell, 1, 783–793.

48. Mazin,A.V. and Kowalczykowski,S.C. (1998) The function of the secondary DNA-binding site of RecA protein during DNA strand exchange. EMBO J., 17, 1161–1168.

49. Rezende,L.F., Willcox,S., Griffith,J.D. and Richardson,C.C. (2003) A single-stranded DNA-binding protein of bacteriophage T7 defective in DNA annealing. J. Biol. Chem., 278, 29098–29105.

50. Datta,S., Costantino,N., Zhou,X. and Court,D.L. (2008) Identification and analysis of recombineering functions from Gram-negative and Gram-positive bacteria and their phages. Proc. Natl Acad. Sci. USA, 105, 1626–1631.

51. Nagaraju,G., Hartlerode,A., Kwok,A., Chandramouly,G. and Scully,R. (2009) XRCC2 and XRCC3 regulate the balance between short- and long-tract gene conversion between sister chromatids. Mol. Cell Biol., 29, 4283–4294.

52. Comeau,A.M., Bertrand,C., Letarov,A., Tetart,F. and Krisch,H.M. (2007) Modular architecture of the T4 phage superfamily: a conserved core genome and a plastic periphery. Virology, 362, 384–396.

53. Breitkreutz,B.J., Stark,C. and Tyers,M. (2002) Osprey: a network visualization system. Genome Biol., 3, R22.
