article in press - pt7mdv.ceingebi.unam.mxpt7mdv.ceingebi.unam.mx/~erueda/cbc_2007.pdf · a.p....
ARTICLE IN PRESS+ModelBAC 5929 1–9
Computational Biology and Chemistry xxx (2007) xxx–xxx
Ligand-binding prediction in the resistance-nodulation-cell division (RND) proteins
Armando Hernandez-Mendoza a, Carmen Quinto a, Lorenzo Segovia b, Ernesto Perez-Rueda a,∗
a Departamento de Biología Molecular de Plantas, Instituto de Biotecnologia, Universidad Nacional Autonoma de Mexico, A.P. 565-A Cuernavaca, Morelos 62210, Mexico
b Departamento de Ingeniería Celular y Biocatalisis, Instituto de Biotecnologia, Universidad Nacional Autonoma de Mexico, A.P. 565-A Cuernavaca, Morelos 62210, Mexico
Received 18 May 2006; received in revised form 7 February 2007; accepted 8 February 2007
Abstract
The resistance-nodulation-cell division (RND) protein family is a ubiquitous group of proteins primarily present in bacteria. These proteins, involved in the transport of multiple drugs across the cell envelope in bacteria, exhibit broad substrate specificity and act like efflux pumps. In this work, a protein belonging to the RND protein family, AcrB of Escherichia coli, was used as a working model to predict in silico the compounds transported by 47 RND proteins. From AcrB we extracted and clustered 14 amino acids directly involved in substrate interactions. This clustering provides enough information to identify 16 groups that correlate with the ligand they extrude, such as proteins expelling aromatic hydrocarbons (SrpB cluster) or proteins expelling heavy metals (CnrA cluster). The relationship between conserved, cluster-specific and variable residues indicates that although the ligand-binding domain is conserved in structure, it has enough flexibility to specifically recognize a diversity of molecules.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: RND-proteins; AcrB; Ligand binding prediction
1. Introduction
In general, homologous proteins share common biological functions but exhibit different specificity towards their substrates. The identification of residues that participate in the specificity of these interactions is useful for functional analysis and for predictive studies. Recent analyses have shown that residues involved in specific recognition between interacting partners are frequently found at conserved positions (Johnson and Church, 2000). These positions are evident in multiple sequence alignments, where the distribution of amino acids reflects functional and structural constraints in the proteins. Recently, a number of algorithms addressing ligand-binding predictions have been developed, such as the Evolutionary Trace method, which exploits information on protein sequence and structure (Lichtarge et al., 1996), whereas alternative methods only use protein sequence information (Berezin et al., 2004).
∗ Corresponding author. Tel.: +52 56 22 76 10; fax: +52 777 3 17 23 88. E-mail address: [email protected] (E. Perez-Rueda).
1476-9271/$ – see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.compbiolchem.2007.02.003
Please cite this article in press as: Hernandez-Mendoza, A. et al., Ligand-binding prediction in the resistance-nodulation-cell division (RND) proteins, Computat. Biol. Chem. (2007), doi:10.1016/j.compbiolchem.2007.02.003
The resistance-nodulation-cell division (RND) protein family is a ubiquitous group of proteins described in bacteria, archaea and eukarya, which is involved in the transport of multiple drugs across the cell envelope (Paulsen et al., 1996; Paulsen et al., 1997). The RND proteins form complexes with membrane fusion proteins (MFP) in the periplasm and with outer membrane channels of the TolC superfamily to accomplish the transport from the cell to the extracellular medium. Diverse RND efflux systems have been functionally characterized, which are involved in the transport of and resistance to antibiotics, hydrophobic dyes, and detergents, among others (Poole, 2004). In addition, a large number of point mutations causing altered substrate or inhibitor specificities in different multidrug transporters have been reported (Hearn et al., 2006; Murakami et al., 2004; Yu et al., 2005), as well as crystallographic data of purified AcrB in the presence of several ligands (Yu et al., 2003). Together, these results suggest not only that these proteins form direct atomic interactions with the substrate molecules but also that a differential interaction can be achieved as a consequence of the substrate nature.
In order to predict ligand molecules in the RND protein family in bacteria, we clustered 14 columns that correspond to amino acids selected from the AcrB protein of Escherichia coli, whose 3D structure has been determined in the presence of four structurally diverse ligands (Murakami et al., 2002). E. coli AcrB is an excellent archetype to predict ligand-binding molecules, since it recognizes many structurally unrelated toxic compounds, such as antibiotics, disinfectants, dyes, and simple solvents (Nikaido and Zgurskaya, 2001). In this work, a multiple sequence alignment (MSA) was constructed with 47 RND-like protein sequences, and 14 columns were extracted corresponding to residues directly involved in substrate interactions in the central cavity of AcrB (Yu et al., 2003) and to residues located in the central pore (Murakami et al., 2002). These 14 columns were used to generate a tree that shows diverse RND multidrug efflux family clusters that correlate with the ligand they recognize and extrude across the cell membrane. Similar approaches have shown the advantage of anchoring functional annotations within a protein family context (Johnson and Church, 2000).
2. Methods
2.1. Sequence analysis
The E. coli AcrB protein sequence, whose crystallographic structure was solved at 3.5 Å, was used as seed in a BLAST search with an E-value ≤ 10⁻⁵ to identify RND protein members in the NR database. Five hundred and sixty-five protein sequences were retrieved from this sequence comparison, but only the 43 RND proteins whose ligands are well known were selected. In addition, RmiB and ORF2pC (Rhizobium etli), Orf2 (R. leguminosarum) and AmeB (Agrobacterium tumefaciens) were included in the analysis because we have experimental evidence supporting a different function than previously reported (see below).
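The E-value filter described above can be sketched in a few lines. This is only an illustration under stated assumptions: the four-column hit format, the subject identifiers and the numeric values are invented for the example and are not taken from the actual BLAST output used in this work.

```python
# Minimal sketch of an E-value cutoff applied to similarity-search hits.
# ASSUMPTION: a simplified, whitespace-separated hit format
# (query, subject, percent identity, E-value); all hits are hypothetical.

EVALUE_CUTOFF = 1e-5  # threshold stated in Section 2.1


def parse_hits(lines):
    """Parse hit lines into dictionaries, skipping blanks and comments."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        query, subject, identity, evalue = line.split()
        hits.append({
            "query": query,
            "subject": subject,
            "identity": float(identity),
            "evalue": float(evalue),
        })
    return hits


def filter_hits(hits, cutoff=EVALUE_CUTOFF):
    """Keep only hits whose E-value is at or below the cutoff."""
    return [hit for hit in hits if hit["evalue"] <= cutoff]


example_lines = [
    "# query subject identity evalue",
    "AcrB_Ecoli hypothetical_hit_1 65.2 1e-120",
    "AcrB_Ecoli hypothetical_hit_2 31.0 3e-06",
    "AcrB_Ecoli hypothetical_hit_3 22.4 0.002",
]

selected = filter_hits(parse_hits(example_lines))
selected_names = [hit["subject"] for hit in selected]
print(selected_names)
```

Only the first two hypothetical hits pass the cutoff; the third has an E-value above 10⁻⁵ and is discarded.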
2.2. Multiple sequence alignments (MSA) and ligand-binding-site analysis
Forty-seven RND proteins were aligned using ClustalX with default conditions and manually edited (Thompson et al., 1997, 1994). In this alignment the AcrB protein sequence from E. coli K12 was used as reference to identify and extract 10 amino acid residues involved in ligand binding. Those residues are 6 Å or less from their ligands (forming preferentially hydrophobic, van der Waals, or electrostatic interactions) in the AcrB protein structure (1IWG) (Yu et al., 2003). Four additional amino acid residues identified experimentally by mutational analysis were also included (Murakami et al., 2004). In total, 14 amino acid residues were selected to perform the analysis: five residues are in the central pore (D101, V105, N109, Q112 and P116), and nine in the central cavity (L25, K29, D99, V382, A385, F386, F388, F458 and F459) (see Table A.1, Fig. A.1 in Appendix A).
In a second step, columns from the MSA corresponding to the 14 ligand-binding residues were selected and used to build a new alignment. This new alignment was analyzed by maximum parsimony (MP), Fitch (F) and neighbor-joining (NJ) methods, and their corresponding trees were generated. From this analysis, 16 clusters emerged that correlated with the substrates they extrude. The MP tree clusters were used to predict ligands in those proteins whose ligands are unknown or contradictory. Alternatively, 100 random MSAs with the same amino acid composition and length as the original were constructed to evaluate the previously described clustering. We must emphasize that phylogenetic tools were only used to cluster similar sets of residues, not to make inferences on the evolutionary history of these proteins.
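The column-extraction step can be illustrated with a short sketch. It assumes that the alignment is held as a dictionary of equal-length gapped strings and that residue numbers refer to the ungapped reference sequence; the function names and the toy alignment are hypothetical, and only the 14 AcrB residue numbers come from the text.

```python
# Minimal sketch: pull the alignment columns that correspond to chosen
# residues of a reference sequence. Function names and toy data are
# illustrative assumptions, not the authors' actual procedure or files.

# Residue numbers of the 14 AcrB positions listed in Section 2.2.
ACRB_POSITIONS = [25, 29, 99, 101, 105, 109, 112, 116,
                  382, 385, 386, 388, 458, 459]


def reference_columns(aligned_ref, positions):
    """Map 1-based residue numbers of the ungapped reference to
    0-based alignment column indices."""
    wanted = set(positions)
    mapping = {}
    residue_number = 0
    for column, char in enumerate(aligned_ref):
        if char != "-":  # gaps do not advance the residue counter
            residue_number += 1
            if residue_number in wanted:
                mapping[residue_number] = column
    missing = wanted - set(mapping)
    if missing:
        raise ValueError(f"positions not found in reference: {sorted(missing)}")
    return mapping


def extract_columns(alignment, ref_name, positions):
    """Build a reduced alignment containing only the selected columns."""
    columns = reference_columns(alignment[ref_name], positions)
    ordered = [columns[p] for p in sorted(positions)]
    return {name: "".join(seq[c] for c in ordered)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical); the reference has one gap at column 2.
toy_alignment = {
    "ref":  "MK-LDAV",
    "seqA": "MRTLE-V",
    "seqB": "-KSLDAI",
}
reduced = extract_columns(toy_alignment, "ref", [2, 4, 6])
print(reduced)
```

With the real alignment, `ACRB_POSITIONS` would take the place of the toy position list, and the reduced alignment would be the input to the tree-building programs.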
2.3. Performance evaluation
Finally, to evaluate the performance of the ligand-binding predictions, the clusters identified by MP ("observed clusters") were compared to their corresponding annotated ligands. This comparison was used to calculate the following values: (1) true positives (TP): proteins with (at least one) common ligand clustered together; (2) false positives (FP): proteins whose ligand was completely different from the rest of the cluster; (3) false negatives (FN): proteins included in a cluster with different ligands; (4) sensitivity, Sn = TP/(TP + FN), is the fraction of proteins recovered in the inferred clustering; (5) positive predictive value, PPV = TP/(TP + FP), is the fraction of the proteins and ligands in the inferred clusters that belongs to the annotated ligand binding; (6) accuracy, Ac = (Sn + PPV)/2, is the average of PPV and Sn.
In this analysis the classical definition of specificity, Sp = TN/(TN + FP), was not used, because it relies on the rate of true negatives (TN), defined as proteins whose ligand has not been experimentally described. Indeed, the number of ligands is typically smaller than the size of the cluster (16 clusters), and the percentage of TN should always be close to 1, which would favorably bias the evaluation.
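The three measures defined above are simple ratios, and a small sketch makes the arithmetic explicit. The TP, FP and FN counts below are hypothetical; only the formulas and the averaged Sn and PPV values (94.0% and 88.0%, reported in Section 3.2) come from the text.

```python
# Minimal sketch of the evaluation measures defined in Section 2.3.
# The per-cluster counts are hypothetical examples.


def sensitivity(tp, fn):
    """Sn = TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """PPV = TP / (TP + FP)."""
    return tp / (tp + fp)


def accuracy(sn, ppv):
    """Ac = (Sn + PPV) / 2."""
    return (sn + ppv) / 2


# Hypothetical cluster: 8 true positives, 1 false positive, 1 false negative.
sn_example = sensitivity(tp=8, fn=1)
ppv_example = positive_predictive_value(tp=8, fp=1)
ac_example = accuracy(sn_example, ppv_example)

# Consistency check on the averages reported in the Results section.
reported_ac = accuracy(0.94, 0.88)
print(round(reported_ac, 2))
```

The last call reproduces the reported accuracy of 91.0% from the reported average sensitivity and PPV.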
3. Results
3.1. Identification and selection of RND proteins
In order to identify members of the RND family, the E. coli AcrB protein sequence was used as seed to scan the NR database. Five hundred and sixty-five proteins were identified as potential RND proteins, and 43 proteins functionally characterized in diverse bacteria were selected, ranging from proteobacteria to bacteroidetes (Fig. 1). These proteins share on average 40% identity among them, and the information on their ligand molecules is known. Around 90% of the 565 proteins identified by BLAST search have been annotated as hypothetical, unknown or uncharacterized proteins, emphasizing the importance of predicting their probable ligand compound. In addition, four proteins whose ligands are unknown or contradictory, RmiB and ORF2pC (R. etli), Orf2 (R. leguminosarum) and AmeB (A. tumefaciens), were considered in this analysis. AmeB might be involved in β-lactam and detergent extrusion; however, its phenotype is not completely clear when the protein is deleted (Peng and Nester, 2001). RmiB and ORF2pC were included because we have experimental evidence that supports the notion that they do not play a role in the rhizobia-legume nodulation process. Recently, we identified in R. etli an operon resembling the RND multidrug efflux system, rmiRABC; however, we did not find a substrate extruded by this putative pump (Hernandez-Mendoza, in press). The clustering analysis proposed here might not only give us clues about the most probable ligands of these proteins, but also suggest the existence of additional substrates.
Fig. 1. Ligand prediction for the 47 RND proteins. (A) Tree generated with MP using the Phylip package program Protpars, with the 14 residue columns involved in the ligand interaction reported for AcrB. Bars in red show the clusters identified. (B) Description of the clusters identified. Columns are as follows: protein name included in the cluster; residues aligned and used to build the tree; substrate preferentially recognized by the proteins; and organisms where the protein has been characterized. In red are the residues from the central pore and in black residues from the central cavity. Substrates included are as follows. AH: aromatic hydrocarbons; AC: acriflavine; AG: aminoglycosides; BKC: benzalkonium chloride; BL: β-lactams; BS: bile salts; CM: chloramphenicol; CP: cephalosporin; CV: crystal violet; COU: coumestrol; CHA: cholic acid; CHE: chenodeoxycholic acid; DC: deoxycholate; EB: ethidium bromide; EM: erythromycin; FA: fatty acids; FQ: fluoroquinolones; FU: fusidic acid; H33342: Hoechst 33342; HM: heavy metals (e.g., Co2+, Zn2+ and Ag+); ML: macrolides; MB: methylene blue; NO: novobiocin; PAH: polycyclic aromatic hydrocarbon; PU: puromycin; RH6G: rhodamine 6G; RF: rifampicin; SDS: sodium dodecyl sulfate; TAU: taurocholic acid; TBT: tributyltin; TC: tetracycline; TM: trimethoprim; TR: triclosan; TX: Triton X-100; TPP: tetraphenylphosphonium bromide; UK: unknown; V: vanadium. In question marks are probable substrates.
As previously reported, RND proteins are organized mostly in operons, together with genes from the membrane fusion protein (MFP) family (Dinh et al., 1994), and are occasionally transcribed together with genes from the outer membrane factor (OMF) family (Paulsen et al., 1997). In order to determine how many efflux systems can be functionally traced per organism, we made an exhaustive genomic context analysis using the GeConT program (Ciria et al., 2004). Interestingly, diverse RND, MFP and OMF protein genes organized in operons were identified. Indeed, we found an average of six efflux systems per bacterial genome. In Pseudomonas aeruginosa and P. putida, 13 efflux systems were identified (E-value ≤ 2.3E−49), of which only six have been experimentally characterized to date. In E. coli K12, seven RND transporters were identified based on sequence comparisons (E-value ≤ 3.2E−43), six of them fully characterized. Finally, in A. tumefaciens, eight probable RND efflux pumps were predicted (E-value ≤ 3.04E−44), though the biological function has been experimentally determined for only two of them (Palumbo et al., 1998; Peng and Nester, 2001). The high number of RND proteins in diverse organisms suggests not only a high duplication rate but also a probable high diversity of compounds that can be expelled by those systems.
In addition, the organization in operons of the MFP and RND genes suggests a probable co-evolution process to protect the cell from toxic compounds. This finding also suggests that RND, MFP and OMF act in conjunction to mediate transport across both membranes of the cell envelope, as has been evidenced by genetic experiments (Dinh et al., 1994).
Table 1
Conserved and variable residues in the RND-protein clusters
"X" denotes 100% identical residues. The first column has a representative protein for the cluster. Second to fifteenth columns indicate the conservation per position: for instance, F458 is conserved in all proteins included in the cluster "AcrD". Cluster-specific residues are in parenthesis. CC: central cavity (black letters); CP: central pore (red letters). Orphan groups were not included.
3.2. Ligand prediction in members of the RND family
In order to predict the most probable ligands in 47 selected members of the RND family, 14 columns out of approximately 1300 in an MSA were picked out, as described in Section 2. These columns correspond to the amino acids involved in E. coli AcrB ligand binding, five residues from the central pore and nine from the central cavity (Fig. 1 and Table 1). These residues are distributed throughout the primary amino acid structure in such a way that there is no local sequence motif that may be associated with a particular binding function. The columns corresponding to the 14 selected residues were extracted into a new MSA to complete the clustering analysis. Although additional residues might be involved in ligand binding, we only considered 14 amino acid residues since they have been well analyzed as involved in physical interactions with diverse compounds extruded by AcrB of E. coli. Indeed, Hearn et al. (2006) identified a diversity of potential residues (most of them included in our analysis) that affect the selectivity for polycyclic aromatic hydrocarbons, antibiotics and solvents in P. fluorescens cLP6a EmhB, supporting the importance of the selected residues.
Based on the MP analysis of the new MSA we identified 16 groups of proteins that correlate with the ligand preference observed for several RND proteins (Fig. 1). The calculated average sensitivity for all clusters analyzed reaches 94.0%, with a positive predictive value (PPV) of 88.0%, leading to an accuracy of 91.0%. This indicates the significance of our analysis. The variability observed among these residues should explain why these proteins recognize such a diversity of ligands. In Fig. 1, we present 14 clusters identified by MP, and in Table 1 we present a summary of conserved positions per cluster. These clusters contained variability in the 14 residues of the central cavity and central pore; some positions (such as F458) can be considered universal and others cluster specific. From this perspective, diverse clusters conserve between 0 and 13 out of 14 residues involved in ligand binding. We suggest not only that this variability inside clusters is enough to differentiate their respective ligands but also that conserved positions per cluster might define the major recognition of the ligands; however, this hypothesis should be further explored.
Alternatively, to contrast the 14-column tree clusters we performed a similar analysis by using MP with the complete RND-protein sequences. The results obtained show that the majority of the RND proteins are grouped within a single large, closely related cluster, while proteins extruding heavy metals (HM) are clustered together (data not shown), as previously shown (Paulsen et al., 1996). This result suggests that the analysis of the complete RND protein sequence alignment is influenced by organism speciation rather than by substrate specificity. Conversely, random MSA clustering did not correlate with the ligand-binding substrates observed with the 14 functional amino acid residues, except for the HM cluster (which, as shown above, is divergent from the other sequences), suggesting that the conservation of the selected residues in their specific positions is influenced by the interaction with the molecules of the compounds extruded by the protein.
In summary, we found that 14 columns with selected amino acids involved in ligand interaction are good enough to group proteins with similar ligands, and they might be useful to predict ligands in proteins whose substrate is unknown. Trees constructed with the whole sequences or with random columns are not informative enough, hence showing the significance of our approach. The results obtained for certain groups are depicted below. Additional groups are described in Appendix A (Table A.1).
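The random-alignment control mentioned above can be sketched as follows. The text only states that the random MSAs kept the amino acid composition and length of the original, so the shuffling scheme here (pooling every residue of the reduced alignment and redistributing them at random) is an assumption, as are the function names and the toy alignment.

```python
# Minimal sketch of a randomized control alignment.
# ASSUMPTION: residues are pooled over the whole alignment and shuffled,
# which preserves overall composition and dimensions; the exact scheme
# used by the authors is not described in the text.
import random
from collections import Counter


def shuffled_alignment(alignment, rng):
    """Return a new alignment with the same names, dimensions and overall
    residue composition, but with residues placed at random."""
    names = list(alignment)
    width = len(alignment[names[0]])
    if any(len(alignment[name]) != width for name in names):
        raise ValueError("all aligned sequences must have the same length")
    pool = [residue for name in names for residue in alignment[name]]
    rng.shuffle(pool)
    return {name: "".join(pool[i * width:(i + 1) * width])
            for i, name in enumerate(names)}


def composition(alignment):
    """Count every residue symbol in the alignment."""
    return Counter("".join(alignment.values()))


# Toy reduced alignment (hypothetical rows).
toy_alignment = {"p1": "LKDDV", "p2": "LRDDV", "p3": "AGDEI"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [shuffled_alignment(toy_alignment, rng) for _ in range(100)]
print(len(replicates), composition(replicates[0]) == composition(toy_alignment))
```

Each replicate would then be clustered with the same tree-building method as the real 14-column alignment, and the resulting groups compared against the annotated ligands.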
3.3. Proteins with broad substrate specificity (AcrB cluster)
A group of three proteins including E. coli AcrB, P. aeruginosa MexB and Salmonella enterica serovar Typhimurium AcrB is shown in Fig. 1B. These proteins have been described as RND proteins with broad substrate specificity (Baucheron et al., 2002; Li et al., 1995; Sulavik et al., 2001). The MP tree using exclusively the 14 residues clustered these proteins together (Fig. 1A). This result indicates that these residues are enough to differentiate proteins with preference for similar ligands. The 14 analyzed residues are highly conserved among members of this group, as only K29 varies (Table 1). The three experimentally characterized proteins bind similar ligands: antibiotics, dyes and detergents (Nikaido and Zgurskaya, 2001), and all might have very broad substrate specificity (Fig. 1B). The substrate diversity associated with the group suggests that the observed residues might participate in the major recognition of their ligands, whereas less conserved residues (such as K29 and alternative residues not considered here) might make specific ligand contacts under different conditions, i.e. they might be exposed in response to the presence of a particular ligand.
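The per-cluster conservation summary can be illustrated with a small sketch. The position labels and the central pore/central cavity assignment come from Section 2.2; the second row of the toy cluster is invented so that only K29 varies, mirroring the pattern described for this group, and the function names are hypothetical.

```python
# Minimal sketch: find which of the 14 selected positions are identical
# across all members of a cluster. The second toy row is a hypothetical
# sequence, not real data from Table 1.

POSITIONS = ["L25", "K29", "D99", "D101", "V105", "N109", "Q112", "P116",
             "V382", "A385", "F386", "F388", "F458", "F459"]
CENTRAL_PORE = {"D101", "V105", "N109", "Q112", "P116"}


def conserved_positions(rows, labels=POSITIONS):
    """Return the labels of columns where every row has the same residue."""
    for row in rows:
        if len(row) != len(labels):
            raise ValueError("each row needs one residue per position label")
    return [label for i, label in enumerate(labels)
            if len({row[i] for row in rows}) == 1]


def count_by_region(labels):
    """Split a list of position labels into central pore (CP) and
    central cavity (CC) counts."""
    pore = sum(1 for label in labels if label in CENTRAL_PORE)
    return {"CP": pore, "CC": len(labels) - pore}


# Reference row built from the E. coli AcrB residues named in the labels.
acrb_reference = "".join(label[0] for label in POSITIONS)
# Hypothetical cluster member that differs only at position 29.
hypothetical_member = "LRDDVNQPVAFFFF"

toy_cluster = [acrb_reference, hypothetical_member]
conserved = conserved_positions(toy_cluster)
variable = [label for label in POSITIONS if label not in conserved]
print(variable, count_by_region(conserved))
```

In this toy case 13 of the 14 positions are conserved, five in the central pore and eight in the central cavity, and K29 is the only variable position.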
3.4. Proteins expelling aromatic hydrocarbons (SrpB cluster)
The analysis of the SrpB cluster demonstrates the accuracy of the approach described in this work, since five proteins extruding aromatic hydrocarbons (AH), such as P. putida TtgH (toluene tolerance) (Rojas et al., 2001), SrpB (solvent-resistant pump) (Kieboom and de Bont, 2001), SepB (Phoenix et al., 2003), and TtgE proteins (Rojas et al., 2001) (Fig. 1B), were clustered together. In addition, P. stutzeri TbtB, which extrudes both AH and tributyltin, an organotin compound (Jude et al., 2004), was clustered in this group. Twelve amino acid residues are conserved within this group. Of these residues, two positions are cluster specific, V382 and A385, while D99 and F388 are completely variable (Table 1), suggesting that these changes are essential for ligand selectivity, i.e. they might participate in discriminating their specific ligands, while conserved residues might be performing the major contacts with AH. We suggest that new hypothetical RND proteins whose amino acid sequence patterns are similar to those of the proteins included in this cluster might extrude AH or related compounds.
3.5. Proteins expelling heavy metals (CnrA cluster)
The CnrA cluster (Fig. 1A) includes five proteins (CnrA, CzcA, CusA, SilA and MexK) sharing 31.5% amino acid identity. These proteins bind cations or heavy metals (HM) (Gupta et al., 2001; Liesegang et al., 1993; Munson et al., 2000; Nies, 1995), except for P. aeruginosa MexK, which is involved in the resistance to tetracycline and to the amphiphilic drug triclosan, a broad-spectrum biocide (Chuanchuen et al., 2002) (Fig. 1B). We found that proteins included in this group do not exhibit any conserved residue from the 14 selected, representing the most divergent cluster identified (Table 1). It is intriguing that HM and MexK proteins are included in the same cluster. We suggest that MexK might recognize and extrude HM compounds, due to its inclusion in the cluster; however, this hypothesis should be further explored. In summary, CnrA proteins might exhibit a binding pocket able to bind molecules of similar size, or might have similar physicochemical properties not discernible in this work.
3.6. Proteins expelling antibiotics (MexD cluster)
Three proteins expelling fluoroquinolones, chloramphenicol, tetracycline and aminoglycoside antibiotics were included in this group: AdeB and MexD, which discharge antibiotics (Gotoh et al., 1998; Magnet et al., 2001), and our hypothetical protein Orf2pC, identified in our laboratory, which does not have an experimentally evident substrate so far (Fig. 1B). The results obtained from this cluster analysis show the predictive potential of this approach, i.e. tetracycline, aminoglycosides, fluoroquinolones (norfloxacin or ciprofloxacin) or chemically similar compounds might be specifically recognized by Orf2pC. This proposal is based exclusively on the clustering analysis described here, where AdeB, MexD and Orf2pC are included in a particular group (Fig. 1A).
In this cluster, only five amino acid positions are conserved: L25, D99, N109, F388 and F458 (Table 1). All of them, except N109, are located in the central cavity, where most of the chemical interactions with the ligands occur. Thus, these conserved residues could be participating in the specific recognition of their substrates.
3.7. Proteins expelling unknown compounds (AmeB cluster)
Finally, RagC (B. japonicum), AmeB (A. tumefaciens), RmiB (R. etli), and Orf2 (R. leguminosarum bv. viciae) were included in the same cluster. RagC might be involved in heavy metal or antibiotic extrusion (Krummenacher and Narberhaus, 2000) and AmeB in extrusion of β-lactams and detergents (Peng and Nester, 2001), while RmiB and Orf2 do not exhibit an evident substrate. Though these proteins were included in the same cluster, we do not have enough evidence supporting their probable ligands; however, they might have HM or Ac as substrates, or probably another substrate not considered here.
. Discussion
In the present work, we evidenced that 14 ligand-bindingmino acids, out of approximately 1300 columns from a MSA,re informative enough for ligand prediction in 47 RND proteins.hese residues are distributed throughout the primary aminocid structure in such a way that there is no local sequence motifhat may be associated to a particular binding function. On this
binding prediction in the resistance-nodulation-cell division (RND)7.02.003
asis, several defined groups we made showed specificity for a 361
iven ligand. The PPV in average is equal to 88.0%, suggesting 362
high level of confidence in these predictions. The data here 363
btained not only suggest that the substrate specificity is deter- 364
IN+ModelC
6 nal B
m365
p366
l367
u368
a369
t370
a371
t372
t373
a374
c375
c376
fi377
b378
379
R380
t381
t382
r383
r 384
m 385
p 386
m 387
A 388
389
w 390
f 391
A 392
r 393
394
TC
G
X
M
A
M
A
S
S
E
A
ARTICLEBAC 5929 1–9
A. Hernandez-Mendoza et al. / Computatio
ined by the properties of these 14 residues, but also that therotein fold adapts to accommodate a considerable diversity ofigands, i.e. that each ligand, in association to the protein clusterses, a slightly different subset of binding residues. Thus, anncestral generic ligand-binding domain would have divergedo expel diverse ligands and, as a result, only a particular set ofctive site residues have been conserved throughout evolution. Inhis regard, the AcrB group contains diverse conserved residueshat can be classified in variable or not conserved; conservedre those residues with identical physicochemical properties asompared to AcrB of E. coli and residues conserved in a specificluster. All these residues together correlate to its substrate speci-city, probably by using a discriminatory mechanism mediatedy conserved and cluster-specific residues.
The approach presented here, will help to understand the
UN
CO
RR
EC
TED
Please cite this article in press as: Hernandez-Mendoza, A. et al., Ligandproteins, Computat. Biol. Chem. (2007), doi:10.1016/j.compbiolchem.200
ND protein specificity in bacteria and other organisms, ando decipher the evolution of these proteins in the context ofheir cognate ligands. Finally, this procedure not only describeesidues functionally important which sequence pattern does not
ctHa
able A.1lusters identified by a 14-residues cluster in RND proteins
roup Proteins Main ligand
epB XepB Ethidium bromide, puromyrifampicin
trD MtrD Ethidium bromide, �-lactafatty acids, Triton X-100
crD Escherichia coli AcrD Antibiotics with amphiphilproperties, such as glycosid
Burkholderia cepacia CeoBexW Pseudomonas aeruginosa MexW, MexI Novobiocin
E. coli MdtC Ethidium bromideHaemophilus influenzae AcrB ß-Lactams
meB Bradyrhizobium japonicum RagC Unknown
Rhizobium etli RmeBAgrobacterium tumefaciens AmeBR. leguminosarum bv. viciae Orf2
meE Stenotrophomonas maltophilia SmeE Macrolides
E. coli YhiV Ethidium bromidedeY Serratia marcescens SdeY Macrolides
B. pseudomallei BpeB Isoflavonoid coumestrolA. tumefaciens IfeB
mhB P. putida TtgB, ArpB Chloramphenicol
P. fluorescens EmhB HAß-Lactams
crF E. coli AcrF Detergents (SDS, bile salts
Erwinia amylovora AcrB Dyes (crystal violet, acriflaEnterobacter aerogenes AcrB
F
PRESSiology and Chemistry xxx (2007) xxx–xxx
eflect an evident sequence motif, but also provides diverse ele-ents to be expanded in new hypothetical RND proteins and
otentially to alternative protein families where ligand bindingodifies allosterically the protein.
cknowledgements
AHM was supported by a grant (90288) from CONACyT. Weould like to thank Enrique Merino Perez and Edmundo Calva
or their discussion and comments.
ppendix A. Ligand-binding prediction in theesistance-nodulation-cell division (RND) proteins
Based on the selection and their posterior clustering of 14
PR
OO
-binding prediction in the resistance-nodulation-cell division (RND)7.02.003
olumns involved in ligand interaction, 16 groups were iden- 395
ified in which almost all members share a common substrate. 396
enceforth, we describe the rest of the groups reported in this 397
rticle (Table A.1 and Fig. A.1). 398
Conservedpositions
Notes
cin, Orphan group (Ikeda and Yoshimura, 2002)
ms, Orphan group (Rouquette-Loughlin et al., 2002)
ices
4 CP Elkins and Nikaido (2002) and Nair et al. (2004)
4 CC2 CP MexI has been also associated to quorum
sensing (Aendekerk et al., 2002). MexW, MdtCand AcrB (Li et al., 2003; Nagakubo et al., 2002;Sanchez et al., 1997)
1 CP Proteins of this cluster have been exclusivelyidentified in Rhizobiaceae (Krummenacher andNarberhaus, 2000; Peng and Nester, 2001)
2 CC
5 PC Chang et al. (2004) and Nishino and Yamaguchi(2001)
4 CC3 CP IfeB could have a role in macrolide resistance,
although this possibility has not been explored(Palumbo et al., 1998). SdeY and BpeB (Chan etal., 2004; Chen et al., 2003)
4 CC
5 CP Changes in residues located in the EmhB centralcavity change substrate specificity (Hearn et al.,2006).TtgB, ArpB (Kieboom and de Bont, 2001;Mosqueda and Ramos, 2000)
8 CC
) 5 CP At least two members use similar compounds(Burse et al., 2004; Nishino and Yamaguchi,2001; Pradel and Pages, 2002)
vine) 6 CC
RE
CTE
D P
RO
OF
ARTICLE IN PRESS+ModelCBAC 5929 1–9
A. Hernandez-Mendoza et al. / Computational Biology and Chemistry xxx (2007) xxx–xxx 7
Table A.1 (Continued)
Group | Proteins | Main ligand | Conserved positions | Notes
SmeB | S. maltophilia SmeB | Unknown | 3 CP | Li et al. (2002), Masuda et al. (2000) and Moore et al. (1999)
 | P. aeruginosa MexY | Aminoglycoside | 2 CC |
 | B. pseudomallei AmrB | Macrolide | |
SdeB | S. marcescens SdeB | Fluoroquinolones | 2 CP | Kohler et al. (1997), Kumar and Worobec (2005), Lin et al. (2002) and Maseda et al. (2000)
 | Campylobacter jejuni CmeB | β-Lactams | 5 CC |
 | P. aeruginosa MexF | | |
NolG | Sinorhizobium meliloti NolG | Novobiocin and deoxycholate (MdtB) | 1 CP | There is no experimental evidence supporting the proposed NolG (Baev et al., 1991) function as a putative lipo-chito-oligosaccharide (Nod factor) exporter (Saier et al., 1994). Based on this analysis, we propose that NolG might be involved in extrusion of antibiotics such as fluoroquinolones and novobiocin or the bile salt deoxycholate rather than in the Rhizobium-legume symbiosis process
 | E. coli MdtB | | 4 CC | MdtB (Nagakubo et al., 2002)
Columns are as follows: the first column gives a representative protein for the cluster; the second to fifth columns give the bacteria and proteins included in the cluster, the main ligands associated with the protein cluster, the number and location of conserved residues in the cluster (CP, central pore; CC, central cavity), and observations.
Fig. A.1. Mapping of the selected residues on the AcrB structure. (A) PyMol ribbon diagram of the E. coli AcrB multidrug efflux pump (PDB 1IWG), showing the residues selected for this study: L (25), K (29), D (99), D (101), V (105), N (109), Q (112), P (116), V (382), A (385), F (386), F (388), F (458) and F (459). (B) Ribbon diagram of two AcrB subunits indicating the localization of the selected residues in the cavity region. Colors represent selected residues per protomer.
References
Baucheron, S., Imberechts, H., Chaslus-Dancla, E., Cloeckaert, A., 2002. The AcrB multidrug transporter plays a major role in high-level fluoroquinolone resistance in Salmonella enterica serovar typhimurium phage type DT204. Microb. Drug Resist. 8, 281–289.
Berezin, C., Glaser, F., Rosenberg, J., Paz, I., Pupko, T., Fariselli, P., Casadio, R., Ben-Tal, N., 2004. ConSeq: the identification of functionally and structurally important residues in protein sequences. Bioinformatics 20, 1322–1324.
Ciria, R., Abreu-Goodger, C., Morett, E., Merino, E., 2004. GeConT: gene context analysis. Bioinformatics 20, 2307–2308.
Chuanchuen, R., Narasaki, C.T., Schweizer, H.P., 2002. The MexJK efflux pump of Pseudomonas aeruginosa requires OprM for antibiotic efflux but not for efflux of triclosan. J. Bacteriol. 184, 5036–5044.
Dinh, T., Paulsen, I.T., Saier Jr., M.H., 1994. A family of extracytoplasmic proteins that allow transport of large molecules across the outer membranes of gram-negative bacteria. J. Bacteriol. 176, 3825–3831.
Gotoh, N., Tsujimoto, H., Tsuda, M., Okamoto, K., Nomura, A., Wada, T., Nakahashi, M., Nishino, T., 1998. Characterization of the MexC-MexD-OprJ multidrug efflux system in DeltamexA-mexB-oprM mutants of Pseudomonas aeruginosa. Antimicrob. Agents Chemother. 42, 1938–1943.
Gupta, A., Phung, L.T., Taylor, D.E., Silver, S., 2001. Diversity of silver resistance genes in IncH incompatibility group plasmids. Microbiology 147, 3393–3402.
Hearn, E.M., Gray, M.R., Foght, J.M., 2006. Mutations in the central cavity and periplasmic domain affect efflux activity of the resistance-nodulation-division pump EmhB from Pseudomonas fluorescens cLP6a. J. Bacteriol. 188, 115–123.
Johnson, J.M., Church, G.M., 2000. Predicting ligand-binding function in families of bacterial receptors. Proc. Natl. Acad. Sci. U.S.A. 97, 3965–3970.
Jude, F., Arpin, C., Brachet-Castang, C., Capdepuy, M., Caumette, P., Quentin, C., 2004. TbtABM, a multidrug efflux pump associated with tributyltin resistance in Pseudomonas stutzeri. FEMS Microbiol. Lett. 232, 7–14.
Kieboom, J., de Bont, J., 2001. Identification and molecular characterization of an efflux system involved in Pseudomonas putida S12 multidrug resistance. Microbiology 147, 43–51.
Krummenacher, P., Narberhaus, F., 2000. Two genes encoding a putative multidrug efflux pump of the RND/MFP family are cotranscribed with an rpoH gene in Bradyrhizobium japonicum. Gene 241, 247–254.
Li, X.Z., Nikaido, H., Poole, K., 1995. Role of mexA-mexB-oprM in antibiotic efflux in Pseudomonas aeruginosa. Antimicrob. Agents Chemother. 39, 1948–1953.
Lichtarge, O., Bourne, H.R., Cohen, F.E., 1996. An evolutionary trace method defines binding surfaces common to protein families. J. Mol. Biol. 257, 342–358.
Liesegang, H., Lemke, K., Siddiqui, R.A., Schlegel, H.G., 1993. Characterization of the inducible nickel and cobalt resistance determinant cnr from pMOL28 of Alcaligenes eutrophus CH34. J. Bacteriol. 175, 767–778.
Magnet, S., Courvalin, P., Lambert, T., 2001. Resistance-nodulation-cell division-type efflux pump involved in aminoglycoside resistance in Acinetobacter baumannii strain BM4454. Antimicrob. Agents Chemother. 45, 3375–3380.
Munson, G.P., Lam, D.L., Outten, F.W., O’Halloran, T.V., 2000. Identification of a copper-responsive two-component system on the chromosome of Escherichia coli K-12. J. Bacteriol. 182, 5864–5871.
Murakami, S., Nakashima, R., Yamashita, E., Yamaguchi, A., 2002. Crystal structure of bacterial multidrug efflux transporter AcrB. Nature 419, 587–593.
Murakami, S., Tamura, N., Saito, A., Hirata, T., Yamaguchi, A., 2004. Extramembrane central pore of multidrug exporter AcrB in Escherichia coli plays an important role in drug transport. J. Biol. Chem. 279, 3743–3748.
Nies, D.H., 1995. The cobalt, zinc, and cadmium efflux system CzcABC from Alcaligenes eutrophus functions as a cation-proton antiporter in Escherichia coli. J. Bacteriol. 177, 2707–2712.
Nikaido, H., Zgurskaya, H.I., 2001. AcrAB and related multidrug efflux pumps of Escherichia coli. J. Mol. Microbiol. Biotechnol. 3, 215–218.
Palumbo, J.D., Kado, C.I., Phillips, D.A., 1998. An isoflavonoid-inducible efflux pump in Agrobacterium tumefaciens is involved in competitive colonization of roots. J. Bacteriol. 180, 3107–3113.
Paulsen, I.T., Brown, M.H., Skurray, R.A., 1996. Proton-dependent multidrug efflux systems. Microbiol. Rev. 60, 575–608.
Paulsen, I.T., Park, J.H., Choi, P.S., Saier Jr., M.H., 1997. A family of gram-negative bacterial outer membrane factors that function in the export of proteins, carbohydrates, drugs and heavy metals from gram-negative bacteria. FEMS Microbiol. Lett. 156, 1–8.
Peng, W.T., Nester, E.W., 2001. Characterization of a putative RND-type efflux system in Agrobacterium tumefaciens. Gene 270, 245–252.
Phoenix, P., Keane, A., Patel, A., Bergeron, H., Ghoshal, S., Lau, P.C., 2003. Characterization of a new solvent-responsive gene locus in Pseudomonas putida F1 and its functionalization as a versatile biosensor. Environ. Microbiol. 5, 1309–1327.
Poole, K., 2004. Efflux-mediated multiresistance in gram-negative bacteria. Clin. Microbiol. Infect. 10, 12–26.
Rojas, A., Duque, E., Mosqueda, G., Golden, G., Hurtado, A., Ramos, J.L., Segura, A., 2001. Three efflux pumps are required to provide efficient tolerance to toluene in Pseudomonas putida DOT-T1E. J. Bacteriol. 183, 3967–3973.
Sulavik, M.C., Houseweart, C., Cramer, C., Jiwani, N., Murgolo, N., Greene, J., DiDomenico, B., Shaw, K.J., Miller, G.H., Hare, R., Shimer, G., 2001. Antibiotic susceptibility profiles of Escherichia coli strains lacking multidrug efflux pump genes. Antimicrob. Agents Chemother. 45, 1126–1136.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucl. Acids Res. 25, 4876–4882.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res. 22, 4673–4680.
Yu, E.W., Aires, J.R., McDermott, G., Nikaido, H., 2005. A periplasmic drug-binding site of the AcrB multidrug efflux pump: a crystallographic and site-directed mutagenesis study. J. Bacteriol. 187, 6804–6815.
Yu, E.W., McDermott, G., Zgurskaya, H.I., Nikaido, H., Koshland Jr., D.E., 2003. Structural basis of multiple drug-binding capacity of the AcrB multidrug efflux pump. Science 300, 976–980.
Further reading
Aendekerk, S., Ghysels, B., Cornelis, P., Baysse, C., 2002. Characterization of a new efflux pump, MexGHI-OpmD, from Pseudomonas aeruginosa that confers resistance to vanadium. Microbiology 148, 2371–2381.
Baev, N., Endre, G., Petrovics, G., Banfalvi, Z., Kondorosi, A., 1991. Six nodulation genes of nod box locus 4 in Rhizobium meliloti are involved in nodulation signal production: nodM codes for d-glucosamine synthetase. Mol. Gen. Genet. 228, 113–124.
Burse, A., Weingart, H., Ullrich, M.S., 2004. The phytoalexin-inducible multidrug efflux pump AcrAB contributes to virulence in the fire blight pathogen Erwinia amylovora. Mol. Plant Microb. Interact. 17, 43–54.
Chan, Y.Y., Tan, T.M., Ong, Y.M., Chua, K.L., 2004. BpeAB-OprB, a multidrug efflux pump in Burkholderia pseudomallei. Antimicrob. Agents Chemother. 48, 1128–1135.
Chang, L.L., Chen, H.F., Chang, C.Y., Lee, T.M., Wu, W.J., 2004. Contribution of integrons, and SmeABC and SmeDEF efflux pumps to multidrug resistance in clinical isolates of Stenotrophomonas maltophilia. J. Antimicrob. Chemother. 53, 518–521.
Chen, J., Kuroda, T., Huda, M.N., Mizushima, T., Tsuchiya, T., 2003. An RND-type multidrug efflux pump SdeXY from Serratia marcescens. J. Antimicrob. Chemother. 52, 176–179.
Elkins, C.A., Nikaido, H., 2002. Substrate specificity of the RND-type multidrug efflux pumps AcrB and AcrD of Escherichia coli is determined predominantly by two large periplasmic loops. J. Bacteriol. 184, 6490–6498.
Ikeda, T., Yoshimura, F., 2002. A resistance-nodulation-cell division family xenobiotic efflux pump in an obligate anaerobe Porphyromonas gingivalis. Antimicrob. Agents Chemother. 46, 3257–3260.
Kohler, T., Michea-Hamzehpour, M., Henze, U., Gotoh, N., Curty, L.K., Pechere, J.C., 1997. Characterization of MexE-MexF-OprN, a positively regulated multidrug efflux system of Pseudomonas aeruginosa. Mol. Microbiol. 23, 345–354.
Kumar, A., Worobec, E.A., 2005. Cloning, sequencing, and characterization of the SdeAB multidrug efflux pump of Serratia marcescens. Antimicrob. Agents Chemother. 49, 1495–1501.
Li, X.Z., Zhang, L., Poole, K., 2002. SmeC, an outer membrane multidrug efflux protein of Stenotrophomonas maltophilia. Antimicrob. Agents Chemother. 46, 333–343.
Li, Y., Mima, T., Komori, Y., Morita, Y., Kuroda, T., Mizushima, T., Tsuchiya, T., 2003. A new member of the tripartite multidrug efflux pumps, MexVW-OprM, in Pseudomonas aeruginosa. J. Antimicrob. Chemother. 52, 572–575.
Lin, J., Michel, L.O., Zhang, Q., 2002. CmeABC functions as a multidrug efflux system in Campylobacter jejuni. Antimicrob. Agents Chemother. 46, 2124–2131.
Maseda, H., Yoneyama, H., Nakae, T., 2000. Assignment of the substrate-selective subunits of the MexEF-OprN multidrug efflux pump of Pseudomonas aeruginosa. Antimicrob. Agents Chemother. 44, 658–664.
Masuda, N., Sakagawa, E., Ohya, S., Gotoh, N., Tsujimoto, H., Nishino, T., 2000. Contribution of the MexX-MexY-oprM efflux system to intrinsic resistance in Pseudomonas aeruginosa. Antimicrob. Agents Chemother. 44, 2242–2246.
Moore, R.A., DeShazer, D., Reckseidler, S., Weissman, A., Woods, D.E., 1999. Efflux-mediated aminoglycoside and macrolide resistance in Burkholderia pseudomallei. Antimicrob. Agents Chemother. 43, 465–470.
Mosqueda, G., Ramos, J.L., 2000. A set of genes encoding a second toluene efflux system in Pseudomonas putida DOT-T1E is linked to the tod genes for toluene metabolism. J. Bacteriol. 182, 937–943.
Nagakubo, S., Nishino, K., Hirata, T., Yamaguchi, A., 2002. The putative response regulator BaeR stimulates multidrug resistance of Escherichia coli via a novel multidrug exporter system, MdtABC. J. Bacteriol. 184, 4161–4167.
Nair, B.M., Cheung Jr., K.J., Griffith, A., Burns, J.L., 2004. Salicylate induces an antibiotic efflux pump in Burkholderia cepacia complex genomovar III (B. cenocepacia). J. Clin. Invest. 113, 464–473.
Nishino, K., Yamaguchi, A., 2001. Analysis of a complete library of putative drug transporter genes in Escherichia coli. J. Bacteriol. 183, 5803–5812.
Pradel, E., Pages, J.M., 2002. The AcrAB-TolC efflux pump contributes to multidrug resistance in the nosocomial pathogen Enterobacter aerogenes. Antimicrob. Agents Chemother. 46, 2640–2643.
Rouquette-Loughlin, C., Stojiljkovic, I., Hrobowski, T., Balthazar, J.T., Shafer, W.M., 2002. Inducible, but not constitutive, resistance of gonococci to hydrophobic agents due to the MtrC-MtrD-MtrE efflux pump requires TonB-ExbB-ExbD proteins. Antimicrob. Agents Chemother. 46, 561–565.
Saier Jr., M.H., Tam, R., Reizer, A., Reizer, J., 1994. Two novel families of bacterial membrane proteins concerned with nodulation, cell division and transport. Mol. Microbiol. 11, 841–847.
Sanchez, L., Pan, W., Vinas, M., Nikaido, H., 1997. The acrAB homolog of Haemophilus influenzae codes for a functional multidrug efflux pump. J. Bacteriol. 179, 6855–6857.