
Upload: dangquynh

Post on 18-Jul-2019


Page 1: Biochemistry - Macromolecular Crystallographysmb.slac.stanford.edu/~ash/ActaF_ligands/chap2-v-4.doc · Web viewDipeptidyl-peptidase VI of Bacillus sphaericus and YkfC of Bacillus

Crystal Structure of YkfC

Crystal structure of the γ-D-glutamyl-L-diamino acid endopeptidase YkfC from Bacillus cereus in complex with L-Ala-γ-D-Glu: insights into substrate recognition by NlpC/P60 cysteine peptidases

Qingping Xu1,2, Polat Abdubek1,3, Tamara Astakhova1,4, Herbert L. Axelrod1,2, Constantina

Bakolitsa1,5, Xiaohui Cai1,6, Dennis Carlton1,7, Connie Chen1,3, Hsiu-Ju Chiu1,2, Michelle Chiu1,3,

Thomas Clayton1,7, Debanu Das1,2, Marc C. Deller1,7, Lian Duan1,4, Kyle Ellrott1,6, Carol L. Farr1,7,

Julie Feuerhelm1,3, Joanna C. Grant1,3, Anna Grzechnik1,7, Gye Won Han1,7, Lukasz

Jaroszewski1,4,5, Kevin K. Jin1,2, Heath E. Klock1,3, Mark W. Knuth1,3, Piotr Kozbial1,5, S. Sri

Krishna1,4,5, Abhinav Kumar1,2, David Marciano1,7, Mitchell D. Miller1,2, Andrew T. Morse1,4,

Edward Nigoghossian1,3, Amanda Nopakun1,7, Linda Okach1,3, Christina Puckett1,3, Ron Reyes1,2,

Henry J. Tien1,7, Christine B. Trame1,2, Henry van den Bedem1,2, Dana Weekes1,5, Tiffany

Wooten1,3, Andrew Yeh1,2, Keith O. Hodgson1,8, John Wooley1,4, Marc-André Elsliger1,7, Ashley

M. Deacon1,2, Adam Godzik1,4,5, Scott A. Lesley1,3,7, and Ian A. Wilson1,7,*

1Joint Center for Structural Genomics, 2Stanford Synchrotron Radiation Lightsource, SLAC

National Accelerator Laboratory, Menlo Park, California 94025, 3Protein Sciences Department,

Genomics Institute of the Novartis Research Foundation, San Diego, California 92121, 4Center

for Research in Biological Systems, University of California, San Diego, La Jolla, California

92093, 5Program on Bioinformatics and Systems Biology, Burnham Institute for Medical

Research, La Jolla, California 92037, 6University of California, San Diego, La Jolla, California

92093, 7Department of Molecular Biology, The Scripps Research Institute, La Jolla, California

92037, 8Photon Science, SLAC National Accelerator Laboratory, Menlo Park, California 94025

Structure of endopeptidase YkfC



* To whom correspondence should be addressed. The Scripps Research Institute, BCC206,

10550 North Torrey Pines Road, La Jolla, CA 92037. E-mail: [email protected]


Abstract

Dipeptidyl-peptidase VI of Bacillus sphaericus and YkfC of Bacillus subtilis were characterized previously as highly specific γ-D-glutamyl-L-diamino acid endopeptidases. We determined the crystal structure of the YkfC orthologue from Bacillus cereus in complex with L-alanine-γ-D-glutamate at 1.8 Å resolution. YkfC contains two N-terminal bacterial SH3 domains (SH3b) and a C-terminal catalytic domain, which is a member of a very large family of cell-wall related cysteine peptidases, known as NlpC/P60 domains. A bound reaction product (L-Ala-γ-D-Glu) allows the identification of conserved sequence and structural signatures, which recognize L-Ala and γ-D-Glu. The structure provides a clear framework for understanding previous experimental results, including the substrate specificity observed in the dipeptidyl-peptidase VI and YkfC. The first SH3b domain plays an important role in defining substrate specificity such that only murein peptides with a free N-terminal alanine are allowed. A conserved acidic residue present on the NlpC/P60 domain interacts with the amine group of the alanine. This structural feature of the catalytic domain allows us to define a subfamily of NlpC/P60 enzymes with the same substrate requirement, including a previously characterized cyanobacterial L-alanine-γ-D-glutamate endopeptidase that is essentially a substructure of YkfC.

Introduction

Cell-wall turnover, which refers to the loss of peptidoglycan components, has been reported in many bacteria, such as Escherichia coli and Bacillus subtilis (1). The products of the turnover are generally reutilized through a process known as peptidoglycan (PG) recycling (2). The molecular processes involved in cell-wall turnover and recycling are not currently well


understood in comparison to cell-wall synthesis, particularly in bacteria other than E. coli (2-6). In E. coli, the cell wall is degraded by lytic transglycosylases to release anhydromuropeptides (GlcNAc-anhMurNAc-L-Ala-γ-D-Glu-DAP-D-Ala; DAP = meso-diaminopimelic acid). The anhydromuropeptides are imported into the cytoplasm primarily by the AmpG permease and subsequently processed by N-acetyl-anhydromuramyl-L-alanine amidase (which cleaves the bond between GlcNAc-anhMurNAc and L-Ala) and the LD-carboxypeptidase LdcA (which cleaves the bond between DAP and D-Ala). There are two possible fates for the resulting murein tripeptide L-Ala-γ-D-Glu-DAP. Under normal growth conditions, these tripeptides are recycled by the Mpl ligase and returned to the peptidoglycan biosynthetic pathway. An additional pathway (Figure 1) is involved in murein tripeptide metabolism in E. coli, probably under nutrient-limiting conditions (6). The MpaA endopeptidase, a metallocarboxypeptidase, specifically cleaves the γ-D-Glu-DAP bond to produce L-Ala-γ-D-Glu and DAP. The L-Ala-γ-D-Glu is then converted to L-Ala-L-Glu by the YcjG epimerase and, subsequently, to L-Ala and L-Glu by the PepD peptidase.
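The E. coli route described above can be read as an ordered series of bond cleavages plus one epimerization. The sketch below is only a bookkeeping model of that description: the enzyme names come from the text, but the list representation of the peptides and all function names are assumptions introduced here for illustration.

```python
# Toy bookkeeping model of the E. coli murein peptide pathway described above.
# Molecules are lists of building blocks; the representation is hypothetical.

ANHYDROMUROPEPTIDE = ["GlcNAc-anhMurNAc", "L-Ala", "γ-D-Glu", "DAP", "D-Ala"]


def amidase(mol):
    """N-acetyl-anhydromuramyl-L-alanine amidase: removes the sugar moiety."""
    assert mol[0] == "GlcNAc-anhMurNAc"
    return mol[1:]


def ldca(mol):
    """LD-carboxypeptidase LdcA: cleaves the DAP-D-Ala bond."""
    assert mol[-2:] == ["DAP", "D-Ala"]
    return mol[:-1]


def mpaa(tripeptide):
    """MpaA: cleaves the γ-D-Glu-DAP bond, giving the dipeptide and DAP."""
    assert tripeptide == ["L-Ala", "γ-D-Glu", "DAP"]
    return tripeptide[:2], tripeptide[2:]


def ycjg(dipeptide):
    """YcjG epimerase: converts L-Ala-γ-D-Glu to L-Ala-L-Glu."""
    assert dipeptide == ["L-Ala", "γ-D-Glu"]
    return ["L-Ala", "L-Glu"]


def pepd(dipeptide):
    """PepD peptidase: splits L-Ala-L-Glu into free amino acids."""
    return [[residue] for residue in dipeptide]


def degrade(muropeptide):
    """Run the nutrient-limitation branch end to end (not the Mpl recycling branch)."""
    tripeptide = ldca(amidase(muropeptide))
    dipeptide, dap = mpaa(tripeptide)
    return pepd(ycjg(dipeptide)) + [dap]


print(degrade(ANHYDROMUROPEPTIDE))
```

The model deliberately ignores transport (AmpG) and the Mpl ligase branch; it only tracks which fragment each enzyme produces.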

Some of the enzymes involved in cell-wall recycling in E. coli, such as AmpG, AmpD, Mpl and MpaA, have no homologs in B. subtilis (2), suggesting that the mechanism of cell-wall recycling may differ in detail between the two bacteria. In B. subtilis, it was proposed that the PG is cleaved by a muramidase and an amidase to produce GlcNAc-MurNAc and stem peptides (2). The free peptides are imported into the cytoplasm by an unidentified permease and subsequently processed by the YkfABC enzymes. YkfA, an LD-carboxypeptidase, removes the terminal D-Ala. The resulting tripeptide is further metabolized by a pathway that is functionally equivalent to that of E. coli, with YkfC as the γ-D-Glu-DAP endopeptidase and YkfB as the L-Ala-D-Glu epimerase (Figure 1) (7).


Interestingly, while YkfB is homologous to YcjG of E. coli, YkfC is not homologous to MpaA

despite an equivalent function.

YkfC contains a C-terminal NlpC/P60 cysteine peptidase domain. NlpC/P60 is a large family

of cell-wall related cysteine peptidases that are broadly distributed in bacteria, viruses, archaea and

eukaryotes (8). Currently characterized NlpC/P60 enzymes are almost all γ-D-Glu-DAP(Lys)

endopeptidases. While their biochemical function seems to be conserved, the physiological roles

of NlpC/P60 proteins are diverse, including cell separation, expansion, differentiation, cell-wall

turnover, cell lysis, protein secretion and virus infection (9). Secreted NlpC/P60 proteins also

have other roles in pathogenesis. The autolysin P60 of Listeria monocytogenes is involved in

host cell invasion (10). Enterotoxin FM of Bacillus cereus is involved in food poisoning (11).

SagA of Enterococcus faecium is a secreted antigen which binds to extracellular matrix proteins

(12).

NlpC/P60 proteins could be lethal to the bacteria due to their ability to compromise cell-wall integrity or cell-wall biosynthesis. Therefore, their activities are tightly controlled through multiple mechanisms (9): the expression of these proteins is regulated at the transcription level; their localization depends on their physiological roles; and their atomic-level structures are modified to precisely define substrate specificity (13). NlpC/P60 proteins are often fused to auxiliary domains, many of which are known cell-wall binding modules (e.g. LysM and choline-binding domains). Thus, it is generally assumed that these auxiliary domains function as targeting domains, which localize their proteins to the cell wall. The functional synergy between the NlpC/P60 domains and their auxiliary domains is currently not fully understood. We previously determined the crystal structure of a γ-D-Glu-DAP endopeptidase from cyanobacteria (AvPCP/NpPCP, Anabaena variabilis/Nostoc punctiforme PG cysteine peptidase) (13), which


consists of an N-terminal bacterial SH3 domain (SH3b) and a C-terminal NlpC/P60 domain. We proposed that the SH3b domain of this enzyme is important in defining the substrate specificity of the peptidase domain. However, the mechanism of substrate recognition by the NlpC/P60 and SH3b domains was not firmly established. Here, we report the crystal structure of YkfC from B. cereus (BcYkfC) in complex with L-Ala-γ-D-Glu. BcYkfC is a close homolog of the YkfC protein from B. subtilis that was previously biochemically characterized (7). Thus, the structure provides a first detailed view of substrate recognition by an NlpC/P60 protein.

Experimental Details


Sequence analysis. Homologs of BcYkfC were obtained using PSI-BLAST (14) (3 iterations) against the non-redundant (nr) protein sequence database at the National Center for Biotechnology Information (NCBI), using the sequence of the catalytic domain of BcYkfC (residues 210-333) as the probe. Alignment length (>=70) and E-value (<=0.02) were used as criteria to extract a subset of hits. These proteins (2599 sequences) were aligned using HMMALIGN (15) against the ls profile model of NlpC/P60 domains from the Pfam database (v23, PF00877) (16). The YkfC subfamily candidates were extracted from this aligned subset based on the presence of an aspartate at the position equivalent to Asp256 of BcYkfC. The full-length sequences were then aligned and clustered using PIPEALIGN (17). Only sequences with a conserved tyrosine at the position equivalent to Tyr118 of BcYkfC were classified into the YkfC subfamily. The phylogenetic tree was constructed using MAFFT (18).
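The selection criteria above amount to two numeric cut-offs followed by two residue checks. The following sketch shows one way such a filter could be expressed; it is not the script used in this work. The `Hit` record layout, the accession strings and the idea of storing the aligned residues directly on each hit are all hypothetical simplifications (in the procedure above, the Asp256 check comes from the HMMALIGN alignment and the Tyr118 check from the later full-length alignment).

```python
from dataclasses import dataclass

MIN_ALIGNMENT_LENGTH = 70  # from the text: alignment length >= 70
MAX_E_VALUE = 0.02         # from the text: E-value <= 0.02


@dataclass
class Hit:
    """One PSI-BLAST hit; this record layout is a hypothetical simplification."""
    accession: str
    alignment_length: int
    e_value: float
    residue_at_asp256: str  # one-letter residue aligned to Asp256 of BcYkfC
    residue_at_tyr118: str  # one-letter residue aligned to Tyr118 of BcYkfC


def passes_hit_filter(hit):
    """Apply the alignment-length and E-value cut-offs."""
    return hit.alignment_length >= MIN_ALIGNMENT_LENGTH and hit.e_value <= MAX_E_VALUE


def is_ykfc_subfamily(hit):
    """Require the cut-offs plus the conserved Asp and Tyr positions."""
    return (
        passes_hit_filter(hit)
        and hit.residue_at_asp256 == "D"
        and hit.residue_at_tyr118 == "Y"
    )


# Made-up example records, for illustration only.
hits = [
    Hit("hypothetical_1", 120, 1e-30, "D", "Y"),
    Hit("hypothetical_2", 65, 1e-10, "D", "Y"),   # too short
    Hit("hypothetical_3", 110, 1e-20, "N", "Y"),  # no Asp at the Asp256 position
]
subfamily = [h.accession for h in hits if is_ykfc_subfamily(h)]
print(subfamily)
```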


Protein expression and purification. Clones were generated using the Polymerase Incomplete Primer Extension (PIPE) cloning method (19). The gene encoding BcYkfC (GenBank: NP_979181; Swiss-Prot: Q736M3) was amplified by polymerase chain reaction (PCR) from Bacillus cereus NRS248 ATCC 10987 genomic DNA using PfuTurbo DNA polymerase (Stratagene) and I-PIPE (Insert) primers (forward primer: 5'-ctgtacttccagggcGAAGAGAAGAAAGATAGTAAGGCGT-3', reverse primer: 5'-aattaagtcgcgttaAGGTAAGTAACGACGCGCACCAGCG-3', target sequence in upper case) that included sequences for the predicted 5' and 3' ends. The expression vector, pSpeedET, which encodes an amino-terminal tobacco etch virus (TEV) protease-cleavable expression and purification tag (MGSDKIHHHHHHENLYFQ/G), was PCR amplified with V-PIPE (Vector) primers (forward primer: 5'-taacgcgacttaattaactcgtttaaacggtctccagc-3', reverse primer: 5'-gccctggaagtacaggttttcgtgatgatgatgatgatg-3'). The V-PIPE and I-PIPE PCR products were mixed to anneal the amplified DNA fragments together. Escherichia coli GeneHogs (Invitrogen) competent cells were transformed with the I-PIPE/V-PIPE mixture and dispensed on selective LB-agar plates. The cloning junctions were confirmed by DNA sequencing. Using the PIPE method, the gene segment encoding residues A1-S23 was deleted. Expression was performed in a selenomethionine-containing medium with suppression of normal methionine synthesis. At the end of fermentation, lysozyme was added to the culture to a final concentration of 250 μg/ml, and the cells were harvested and frozen. After one freeze/thaw cycle, the cells were sonicated in lysis buffer [50 mM HEPES pH 8.0, 50 mM NaCl, 10 mM imidazole, 1 mM Tris(2-carboxyethyl)phosphine-HCl (TCEP)] and the lysate was clarified by centrifugation at 32,500g for 30 min. The soluble fraction was passed over nickel-chelating resin (GE Healthcare) pre-equilibrated with lysis buffer, the resin was washed with wash buffer [50 mM HEPES pH 8.0, 300 mM


NaCl, 40 mM imidazole, 10% (v/v) glycerol, 1 mM TCEP], and the protein was eluted with elution buffer [20 mM HEPES pH 8.0, 300 mM imidazole, 10% (v/v) glycerol, 1 mM TCEP]. The eluate was buffer exchanged with TEV buffer [20 mM HEPES pH 8.0, 200 mM NaCl, 40 mM imidazole, 1 mM TCEP] using a PD-10 column (GE Healthcare), and incubated with 1 mg of TEV protease per 15 mg of eluted protein. The protease-treated eluate was run over nickel-chelating resin (GE Healthcare) pre-equilibrated with HEPES crystallization buffer [20 mM HEPES pH 8.0, 200 mM NaCl, 40 mM imidazole, 1 mM TCEP] and the resin was washed with the same buffer. The flow-through and wash fractions were combined and concentrated to 18.8 mg/ml by centrifugal ultrafiltration (Millipore) for crystallization trials.
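PIPE cloning depends on the insert and vector PCR products carrying complementary ends, so the lower-case 5' extension of each I-PIPE primer should be the reverse complement of the start of one V-PIPE primer. The short check below uses the four primer sequences quoted in this section; the helper names are introduced here for illustration only and are not part of any published tool.

```python
from itertools import takewhile

# Primer sequences as quoted in the text (lower case = vector-derived extension).
I_PIPE_FORWARD = "ctgtacttccagggcGAAGAGAAGAAAGATAGTAAGGCGT"
I_PIPE_REVERSE = "aattaagtcgcgttaAGGTAAGTAACGACGCGCACCAGCG"
V_PIPE_FORWARD = "taacgcgacttaattaactcgtttaaacggtctccagc"
V_PIPE_REVERSE = "gccctggaagtacaggttttcgtgatgatgatgatgatg"

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def vector_extension(insert_primer):
    """Return the leading lower-case (vector-derived) part of an I-PIPE primer."""
    return "".join(takewhile(str.islower, insert_primer))


def anneals_to(insert_primer, vector_primer):
    """True if the insert primer's 5' extension is complementary to the vector primer's 5' end."""
    return vector_primer.startswith(reverse_complement(vector_extension(insert_primer)))


print(anneals_to(I_PIPE_FORWARD, V_PIPE_REVERSE))
print(anneals_to(I_PIPE_REVERSE, V_PIPE_FORWARD))
```

Both checks pair an insert primer with the vector primer of opposite orientation, which is the pairing expected when the insert replaces the region between the two vector primers.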

Oligomeric state. The oligomeric state of BcYkfC was determined using a 0.8 x 30cm2 Shodex

Protein KW-803 column (Thomson Instruments) pre-calibrated with gel filtration standards (Bio-Rad). Crystal packing analysis and analytical size exclusion chromatography both indicate that BcYkfC is likely a monomer in solution.


Crystallization. BcYkfC was crystallized by mixing 200 nl protein with 200 nl crystallization solution over a 50 μl reservoir volume using the nanodroplet vapor diffusion method (20) with standard JCSG crystallization protocols (21). The crystallization reagent that produced the BcYkfC crystal used for structure solution contained 0.2 M sodium chloride, 50% PEG-200 and 0.1 M phosphate-citrate pH 4.2. A needle-shaped crystal of approximate size 100 μm x 15 μm x 15 μm was harvested after 29 days at 277 K. No additional cryoprotectant was added to the crystal. Initial screening for diffraction was carried out using the Stanford Automated Mounting system (SAM) (22) at the Stanford Synchrotron Radiation Lightsource (SSRL, Menlo Park, CA).


Data collection, structure solution, and refinement. Multi-wavelength anomalous diffraction (MAD) data were collected at SSRL beamline 11-1. Data were collected at wavelengths corresponding to the peak, high-energy remote and inflection point of a selenium MAD experiment at 100 K using a MarCCD 325 detector (Rayonix). Data processing and structure solution from the initial diffraction images were carried out using an automatic structure solution script (autoXDSp) developed at the JCSG. This script shepherds the structure determination process described below using a few preset rules that mimic the decision-making of a crystallographer. The calculations were parallelized on a PC cluster such that an initial map/model can usually be obtained within an hour of the completion of data collection. The MAD data were integrated and reduced using XDS and then scaled with the program XSCALE (23). Selenium sites were located with SHELXD (24). Phase refinement and automatic model building were performed using autoSHARP (25) and ARP/wARP (26). This automatic process produced an initial model that was 92% complete. Further model completion and refinement were performed manually with COOT (27) and REFMAC (28) of the CCP4 suite (29). Data and refinement statistics are summarized in Table 1. The stereochemical quality of the model was analyzed using MolProbity (30). All molecular graphics were prepared with PyMOL (DeLano Scientific) unless specifically stated otherwise.

Molecular modeling. Molecular docking was performed with Glide 5.0 (Schrödinger LLC, Portland, OR) using the same protocol as described previously (13). The positions of the bound ligand were used as restraints such that the L-Ala-γ-D-Glu portion of the docked substrate adopts a conformation similar to that seen in the crystal structure. The catalytic cysteine was modeled in its reduced form for the docking studies. Only the main conformers were used where multiple conformations were observed for active-site residues.


Accession code. Atomic coordinates and experimental structure factors for BcYkfC at 1.8 Å have

been deposited in the PDB under accession code 3h41.

Results and Discussion

Genomic context. The full-length BcYkfC (strain B. cereus ATCC 10987) contains 333 residues (molecular weight 37.3 kDa). The N-terminal 23 residues of BcYkfC were predicted to be a signal peptide (not a membrane-anchoring helix) by the Phobius program (31). B. subtilis YkfC (296 aa), despite being highly homologous to BcYkfC (40% sequence identity), does not contain a signal peptide, suggesting that the two proteins function at different cellular locations. As in B. subtilis, the ykfC gene (BCE_2878) is adjacent to the ykfB L-Ala-D-Glu epimerase gene (BCE_2879) in the B. cereus genome. The conserved genomic association of ykfB and ykfC is also observed in other bacteria (e.g. Listeria innocua, Acidobacterium sp., Solibacter usitatus, Bacteroides thetaiotaomicron, Gramella forsetii). However, the genomic context of ykfBC differs between B. subtilis and B. cereus (Figure 2). The ykfA-D genes of B. subtilis are located next to the dppB-E dipeptide ABC transporter operon, whereas the ykfBC genes of B. cereus are located downstream of divIC, which encodes a putative cell division protein. Downstream of ykfC is an oppA gene that is homologous to dppE (34% sequence identity). Both genes are predicted to encode periplasmic dipeptide-binding proteins.

Interestingly, the YkfB epimerases of both B. subtilis and B. cereus (50% sequence identity) do not have signal peptides. The differences in signal peptides between the YkfC proteins of B. subtilis and B. cereus, and in their genomic contexts, suggest that the strategy for metabolizing murein peptides likely differs in the two bacteria. In B. cereus, the murein peptides are likely broken down outside the cell with



the resulting dipeptide L-Ala-γ-D-Glu imported into the cytoplasm for further processing by YkfB, while both reactions likely occur in the cytoplasm in B. subtilis.

Structure determination and quality of the model. The crystal structure of BcYkfC was determined using the high-throughput structural genomics pipeline implemented at the Joint Center for Structural Genomics (JCSG) (21). The selenomethionine derivative of BcYkfC was expressed in E. coli with an N-terminal TEV-cleavable His-tag and purified by metal affinity chromatography. The predicted N-terminal signal peptide (residues 1-23) was not included in the cloned construct. The structure was solved in space group C2 at 1.79 Å resolution with one molecule per asymmetric unit using the MAD method (Rfactor = 16.3%, Rfree = 19.7%). The main-chain density is excellent throughout the whole molecule. The mean residual error of the coordinates was estimated at 0.11 Å by the diffraction-component precision index (DPI) method (32). The model of BcYkfC displays good geometry with an all-atom clash score of 5.07, and the Ramachandran plot produced by MolProbity (30) shows that all residues but one are in allowed regions, with 97.7% in favored regions. The side chains of the model are also well refined, with only one residue flagged by MolProbity as a rotamer outlier. The Ramachandran outlier (Pro305) and the rotamer outlier (His303) are supported by well-defined electron density. Since these residues are either close to or part of the active site, these structural anomalies are likely of functional relevance. Additionally, two cis-peptides (Asn57-Pro58 and Asn154-Pro155) are supported by clear electron density. The final model of BcYkfC contains residues 29 to 333, one L-Ala-γ-D-Glu dipeptide, one phosphate, six polyethylene glycol (PEG) fragments from the crystallization solution and 265 waters. The remaining residue of the N-terminal purification tag (Gly0), residues 24-28, and the side chains of residues Glu201 and Arg270 are disordered and were not


included in the final model. Data collection, refinement and model statistics are summarized in

Table 1.

Overall structure. The structure of BcYkfC consists of three domains: two bacterial SH3 domains (SH3b1, residues 29-129; SH3b2, residues 130-207) and a C-terminal NlpC/P60 cysteine peptidase domain (residues 208-333) (Figure 3A). The structures of the two SH3b domains are similar to each other, with an rmsd of 2.2 Å for 57 aligned Cα atoms (sequence identity 13%). SH3b1 contains a helical insertion (α1-α3) compared to SH3b2 (Figure 3B). A structural similarity search using Dali (33) indicated that there are no previously known structures with the same three-domain architecture. However, significantly similar substructures were identified. A summary of the structural comparisons is presented in Table 2. The SH3b domains are similar to several known SH3bs (33), including the N-terminal domain of cyanobacterial γ-D-Glu-DAP endopeptidases (13), the GW domains of Internalin B (34), the cell wall-targeting domain of glycylglycine endopeptidase ALE-1 (35), the PhnA-like protein (36) and the endolysin PlyPSA (37), as well as many eukaryotic SH3 domains. A β-hairpin within the RT loop region (between βA and βB) appears to be a common and unique structural feature of SH3bs compared with their eukaryotic counterparts (13). Both SH3bs of BcYkfC also have this conserved structural motif (βA1-βA2). The C-terminal NlpC/P60 catalytic domain of BcYkfC is remotely related to the papain family of cysteine peptidases (8). It is most similar to cyanobacterial γ-D-Glu-DAP endopeptidases (13), E. coli lipoprotein Spr (38) and two uncharacterized proteins (PDB codes: 2p1g and 2im9).

The three domains of BcYkfC are arranged in a triangular manner such that each domain interacts with the other two. The interface between the two SH3b domains is mostly hydrophobic. This interface (~570 Å² per domain) centers on the βA2-βB loop of SH3b1


and the βA2-βB region of SH3b2. The SH3b1/NlpC/P60 domain interface is mediated through the α2-α3-βA1 region and βC-βD loop of SH3b1 and the 1-2-3 region and the 6-7 loop of NlpC/P60, and buries a surface area of 903 Å² per domain. The active site is located at the domain interface of SH3b1 and NlpC/P60. The SH3b2 domain is located away from the active site and thus does not contribute to it. A multiple sequence alignment of full-length homologs of YkfC (37 sequences, average sequence identity 54%) indicated that most of the highly conserved residues are either buried inside the protein or clustered around the active site (Figure 3C).

Active site. The catalytic triad of the peptidase domain consists of Cys238, His291 and His303. Cys238 appears to be tri-oxidized based on the electron density. The conformation of the cysteine side chain is not significantly affected by the oxidation, as it is similar to that seen in other NlpC/P60 structures (13).

A dipeptide, L-Ala-γ-D-Glu, was identified based on well-defined electron density in the active site of BcYkfC (Figure 4A). The dipeptide was not added during protein purification or crystallization, so it was most likely acquired from the expression host E. coli. Since this dipeptide is one of the reaction products of BcYkfC (Figure 2), it unequivocally identifies the S1-2 binding site cavity, which is formed by residues from both the SH3b1 domain (Glu83, Thr84 and Tyr118) and the NlpC/P60 domain (Tyr226, Trp228, Ala229, Asp237, Arg255, Asp256, Ser257, His290 and His291). Two charged residues, Asp237 and Arg255, which are highly conserved in NlpC/P60 domains, are involved in a hydrogen-bond network connecting many residues in the active site (Figure 4B).
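The statement that the binding cavity is built from two domains can be checked directly against the domain boundaries given under "Overall structure". The sketch below is a minimal lookup, assuming those boundaries (SH3b1 29-129, SH3b2 130-207, NlpC/P60 208-333) and the pocket residues listed above; the function and variable names are hypothetical.

```python
# Domain boundaries as stated in the text (inclusive residue numbers).
DOMAINS = {
    "SH3b1": range(29, 130),
    "SH3b2": range(130, 208),
    "NlpC/P60": range(208, 334),
}

# Pocket residues as listed in the text.
POCKET_RESIDUES = [
    "Glu83", "Thr84", "Tyr118",
    "Tyr226", "Trp228", "Ala229", "Asp237", "Arg255",
    "Asp256", "Ser257", "His290", "His291",
]


def domain_of(residue_number):
    """Return the domain containing a residue number, or None if outside the model."""
    for name, span in DOMAINS.items():
        if residue_number in span:
            return name
    return None


def group_by_domain(labels):
    """Group labels such as 'Glu83' by domain (three-letter code + number assumed)."""
    grouped = {}
    for label in labels:
        grouped.setdefault(domain_of(int(label[3:])), []).append(label)
    return grouped


pocket_by_domain = group_by_domain(POCKET_RESIDUES)
for domain, residues in pocket_by_domain.items():
    print(domain, residues)
```

With these boundaries, none of the listed pocket residues falls in SH3b2, consistent with the observation that SH3b2 lies away from the active site.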

Two of the residues in the active site, Ser257 and His291, displayed discrete side chain

disorder. The first conformer of His291 (occupancy modeled as 0.7), which allows hydrogen

bonding with His303, is identical to the corresponding histidine residue of the catalytic dyad in


papain and other NlpC/P60 enzymes. The second conformation of His291 points towards the

solvent. The side chain disorder in the active site is likely an artifact of the oxidized state of the

catalytic Cys238, since the oxidized Cys238 makes multiple hydrogen bond interactions with

nearby side chains, including Tyr226, Ser257 and His291 (Figure 4B).

Recognition of L-Ala-γ-D-Glu. The active-site residues form a pocket that is highly complementary in shape and molecular properties to the ligand (Figure 4C-D). The average B-factor for the bound ligand is 26 Å² (the overall B-factor of the protein is 17.8 Å²). The interface between the dipeptide and the protein buries a total surface area of 490 Å². The dipeptide is stabilized by multiple hydrogen bonds. The free amine group of the L-Ala makes hydrogen bonds with Glu83 O, Tyr118 OH and Asp256 Oδ1. The -NH group of the γ-D-Glu forms a weak hydrogen bond with Asp237 Oδ2. The -carboxyl of D-Glu is stabilized by hydrogen-bonding interactions with Ser239 and Ser257. On the solvent-exposed side of the substrate, waters are also involved in the hydrogen-bond network with the substrate and the enzyme (not shown). The side chain of the L-Ala points towards a hydrophobic region defined by the side chains of Trp228 and Ala229. Additionally, the aliphatic carbons of D-Glu are involved in hydrophobic contacts with the Trp228 side chain. As a result, Trp228 contributes to both the S2 and S1 binding sites.

YkfC is specific for murein peptides with a free N-terminal L-Ala. The active site of B. subtilis YkfC is highly conserved compared to that of BcYkfC (Figure 5). Dipeptidyl-peptidase VI (DPP VI) of B. sphaericus is a γ-D-Glu-DAP(Lys) dipeptidase that has been found in the cell cytoplasm during sporulation (39). It has strict specificity for murein peptides with a free N-terminal L-Ala. The crystal structure of BcYkfC provides a structural basis for this specificity. DPP VI shares 21% sequence identity with BcYkfC, with conserved residues distributed across the full length of the protein (Figure 5). These residues are likely important for correct folding or function. Thus, DPP VI is likely a structural homolog of BcYkfC, except that the SH3b1 domain of DPP VI has no helical insertion (α1-α3) between βA1 and βA2 and has a shorter βC-βD loop (Figure 5). Furthermore, the S2 and S1 binding sites for L-Ala-γ-D-Glu are highly conserved between DPP VI and BcYkfC. DPP VI possesses the conserved tyrosine (Tyr118 of YkfC), the only residue from SH3b1 whose side chain interacts with L-Ala. The residues from the catalytic domain that contribute to the S2 and S1 sites are identical between BcYkfC and DPP VI (except for an A/S substitution at position 257 of BcYkfC). We therefore conclude that both B. subtilis and B. cereus YkfCs have the same substrate specificity for murein peptides with a free N-terminal L-Ala. Enzymes with this specificity cannot cleave PG biosynthesis precursors such as UDP-MurNAc-pentapeptide, and are thus unlikely to interfere with the PG synthesis pathway.
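As an illustration of how a pairwise sequence identity such as the value quoted for DPP VI and BcYkfC can be computed, the following minimal Python sketch counts identical residues over the aligned, ungapped columns of two sequences. The function name and the toy sequences are hypothetical, and the choice of denominator (columns where both sequences have a residue) is an assumption; other conventions exist and the text does not state which one was used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not real YkfC or DPP VI sequence).
toy_a = "MKT-AYLDS"
toy_b = "MRTQAY-DA"
print(round(percent_identity(toy_a, toy_b), 1))
```

Because the result depends on the alignment itself and on the denominator convention, values from different tools are not always directly comparable.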

S’ sites of YkfC and docking studies. B. subtilis YkfC was shown previously to process the tripeptide (L-Ala-γ-D-Glu-L-Lys), the tetrapeptide (L-Ala-γ-D-Glu-L-Lys-D-Ala) and the pentapeptide (L-Ala-γ-D-Glu-L-Lys-D-Ala-D-Ala) (7). DPP VI also displayed no specificity towards additional residues attached to DAP (or Lys) (39). The residues in BcYkfC that potentially form the S’ sites include His290, His291, Arg306, Glu308, Arg309, Tyr321 and Glu324. These residues are generally less conserved among close BcYkfC homologs (Figure 3), suggesting that the specificity of YkfC is determined primarily at the S sites as described above. The catalytic mechanism of NlpC/P60 is currently unknown, but it is likely to be similar to that of papain, based on a similar arrangement of their catalytic residues (8, 13, 38). The third polar residue of the catalytic triad in NlpC/P60 is more often a histidine instead of the asparagine found in papain (16). Furthermore, a tyrosine (Tyr226 of BcYkfC) is equivalent to a glutamine (Gln19) in papain that is thought to be important for substrate binding during catalysis. We

docked the tripeptide L-Ala-γ-D-Glu-L-DAP into the active site of BcYkfC. Figure 6A shows a possible mode of tripeptide interaction with the enzyme. The Cδ end of γ-D-Glu is displaced due to oxidation of Cys238 in the crystal structure. In the modeled structure, Tyr226 OH interacts with the carbonyl group of the γ-D-Glu (Figure 6B). This conformation places the Cδ atom close to Cys238 SH (distance ~3.6 Å), likely similar to a productive conformation. The electrostatic surface of the binding site is complementary to that of the substrate, indicating that electrostatic interactions are likely important for substrate recognition. Basic residues (His290, His291 and Arg306) could interact with carboxyl groups. It has been previously reported that the catalytic efficiency of B. subtilis YkfC for L-Ala-γ-D-Glu-Lys is significantly lower than that for tetra- or pentapeptides (7). This could be explained by an unfavorable environment (near Arg306/Glu308; Lys/Gly in B. subtilis YkfC) for accommodating the positive charge on Lys.
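The kind of distance check quoted above for the docked model (the scissile carbon lying within a few ångströms of the catalytic cysteine) can be sketched in a few lines of Python. The coordinates and the 4.0 Å cutoff below are hypothetical placeholders chosen only for illustration; they are not taken from the deposited structure or from the docking run.

```python
import math


def atom_distance(atom_a, atom_b):
    """Euclidean distance (in Å) between two (x, y, z) coordinates."""
    return math.dist(atom_a, atom_b)


def is_near(atom_a, atom_b, cutoff):
    """True if the two atoms are within the given cutoff distance."""
    return atom_distance(atom_a, atom_b) <= cutoff


# Hypothetical coordinates, for illustration only.
substrate_carbon = (0.0, 0.0, 0.0)
cysteine_sulfur = (2.0, 2.0, 2.2)

d = atom_distance(substrate_carbon, cysteine_sulfur)
print(f"{d:.2f} Å, near = {is_near(substrate_carbon, cysteine_sulfur, cutoff=4.0)}")
```

In practice the coordinates would be read from the model file, and the cutoff would be chosen according to the geometry expected for nucleophilic attack.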

Recognition of γ-D-Glu by NlpC/P60 homologs. NlpC/P60 cysteine peptidases are ubiquitous in bacteria; there are ~5,000 NlpC/P60 domains in the current Pfam database when metagenomics data are included (16). However, only a few of these domains have been biochemically characterized. In light of the clearer picture we now have of substrate binding in BcYkfC, we examined the sequence conservation of NlpC/P60 family proteins around the active site in order to gain further insights into substrate specificity across the entire family.

The non-redundant (nr) database of protein sequences was searched using the PSI-BLAST program (14) with the NlpC/P60 domain of BcYkfC as a search probe. From 3985 total hits, we extracted 2599 protein sequences using the criteria of E-value <0.02 and alignment length >70 residues, representing a significant subset of NlpC/P60 proteins. Among these proteins, ~1% do not possess the conserved Cys/His dyad, and an additional 9% do not contain the conserved tyrosine that is believed to be essential. These proteins are likely to have a similar fold, but either

have lost cysteine peptidase activity, or are remote homologs that have different substrate specificity (e.g. the CHAP family) (8, 40, 41). The sequence conservation pattern of active site residues of the remaining 2277 proteins (average sequence identity 19%) is shown in Figure 6C. The S sites (S2 and S1) are significantly more conserved than the S’ sites in these proteins. The most conserved site is S1, which is tailored to recognize γ-D-Glu specifically. The three most highly conserved residues (Asp237, Ser239 and Tyr226), aside from the catalytic dyad, are involved in hydrogen bonding with γ-D-Glu (Figure 6B). At the position of Ser257, residues with small side chains (A/S/T) are observed, creating a cavity for the carboxyl group of γ-D-Glu. Trp228 contributes to both the S1 and S2 sites and is conserved as a hydrophobic residue, which may facilitate the interaction with Cβ of Ala. Therefore, the conservation of active site residues indicates that a significant percentage of the NlpC/P60 family is likely to be specific for γ-D-Glu. More than 1300 proteins, which contain the strictly conserved Trp228, Asp237, Ser239 and Tyr226 residues (and catalytic triads), most likely bind X-L-Ala-γ-D-Glu moieties (X = H or other moieties).
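The selection and classification steps described in this section can be sketched as follows. This is a minimal illustration in Python, not the scripts used in the study: the `Hit` record, its field names and the toy data are hypothetical, and the sketch assumes the usual convention that a smaller E-value is more significant, so hits are retained when the E-value is below the 0.02 threshold and the alignment is longer than 70 residues.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One database hit; fields are illustrative, not a PSI-BLAST format."""
    seq_id: str
    evalue: float
    align_len: int
    has_dyad: bool      # catalytic Cys/His both present in the alignment
    has_tyrosine: bool  # tyrosine equivalent to Tyr226 present


def select_hits(hits, max_evalue=0.02, min_align_len=70):
    """Keep hits that are significant and cover enough of the domain."""
    return [h for h in hits if h.evalue < max_evalue and h.align_len > min_align_len]


def count_classes(hits):
    """Split hits into those lacking the dyad, lacking only the tyrosine, or intact."""
    counts = {"no_dyad": 0, "no_tyrosine": 0, "intact": 0}
    for h in hits:
        if not h.has_dyad:
            counts["no_dyad"] += 1
        elif not h.has_tyrosine:
            counts["no_tyrosine"] += 1
        else:
            counts["intact"] += 1
    return counts


# Toy data, for illustration only.
toy_hits = [
    Hit("seqA", 1e-30, 120, True, True),
    Hit("seqB", 5e-4, 95, True, False),
    Hit("seqC", 0.5, 110, True, True),    # rejected: E-value too large
    Hit("seqD", 1e-10, 40, True, True),   # rejected: alignment too short
    Hit("seqE", 1e-8, 80, False, False),
]
selected = select_hits(toy_hits)
print([h.seq_id for h in selected])
print(count_classes(selected))
```

Only the "intact" class would be carried forward to a conservation analysis of the active site residues.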

Structural comparison to the cyanobacterial NlpC/P60 L-Ala-γ-D-Glu endopeptidases. We previously determined crystal structures of two closely related NlpC/P60 L-Ala-γ-D-Glu endopeptidases from the cyanobacteria A. variabilis (AvPCP) and N. punctiforme (NpPCP). These enzymes contain an N-terminal SH3b domain and a C-terminal NlpC/P60 catalytic domain (13). Surprisingly, the structures of the cyanobacterial enzymes are essentially a substructure of YkfC: the two domains of the cyanobacterial enzymes are structurally equivalent to the first and third domains of BcYkfC (Figures 5 and 7A-B). The full-length AvPCP can be superimposed onto BcYkfC with an rmsd of 2.3 Å and 29% sequence identity for 193 aligned Cα atoms (Table 2). Thus, the individual SH3b and NlpC/P60 domains, as well as their arrangements, are highly

similar in AvPCP and BcYkfC. Furthermore, the residues of the S2 and S1 pockets are nearly identical (except for S257A, Figure 7C). The striking similarity between the cyanobacterial enzymes and YkfC was previously not detected by sequence analysis due to the presence of two insertions: the entire SH3b2 domain and the α1-α3 region of SH3b1.

The βC-βD loop and the βA1-βA2 loop of SH3b1 are longer than those of the SH3b domain of AvPCP. The βC-βD loop of YkfC is located upstream of the S2 site. The βA1-βA2 loop of YkfC contains three helices (α1-α3) and is located at the S’ side of the binding groove on the catalytic domain. This insertion packs against the surface of the NlpC/P60 domain and contributes to stabilization of the interface between the SH3b1 and NlpC/P60 domains. Furthermore, it is also located at the S’ binding site and defines its shape. Indeed, the S’ sites of BcYkfC appear more restricted than those of AvPCP (Figure 7D). Without these insertions to stabilize the domain interface between SH3b and NlpC/P60, AvPCP uses a different strategy to achieve a similar purpose: a longer C-terminal loop extends to interact with its SH3b domain.

We previously proposed that the cyanobacterial enzymes have the same substrate requirement as DPP VI based on modeling studies (13). The crystal structure of YkfC reported here further supports the proposal that DPP VI, YkfC and the cyanobacterial enzymes belong to a subset of NlpC/P60 γ-D-Glu-DAP endopeptidases whose substrates possess a free N-terminal L-Ala. The SH3b domain helps define the S2 binding site (Tyr118 in BcYkfC, Tyr64 in AvPCP). Additionally, it sterically hinders docking of a large moiety beyond the S2 site. Thus, it directly contributes to specificity. This role constitutes a new function for SH3b domains.

Identification and distribution of YkfC homologs. As shown above, BcYkfC and cyanobacterial γ-D-Glu-DAP endopeptidases have the same specificity. Similar NlpC/P60 enzymes can be detected by sequence homology in both the SH3b and the NlpC/P60 domains. However, due to

the presence of long insertions/deletions and highly divergent sequences in the SH3b domains, it is often difficult to detect similar enzymes using sequence searches, and the NlpC/P60 region tends to dominate the hits. We therefore explored an alternative approach that examines specific residues on the NlpC/P60 domain only. An acidic residue (Asp256) is essential for YkfC specificity because it neutralizes the positive charge on the free amine of the L-Ala. Among the 2599 sequences obtained above, we identified 282 proteins that contain an aspartate equivalent to Asp256 and a conserved catalytic dyad. Of these 282 proteins, 239 (85%) contain the conserved tyrosine on their SH3b domains (Tyr118 of BcYkfC). Thus, the presence of an aspartate at the position of Asp256 is highly correlated with a conserved tyrosine on SH3b (Figure 6D). Interestingly, NlpC/P60 domains with the conserved aspartate are also found in a few single-domain or multiple-domain proteins that do not contain a detectable SH3b domain. The biochemical properties of these proteins are currently unknown; they could represent novel variants of a YkfC-type enzyme.
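The residue-based selection described above amounts to checking two columns of a multiple sequence alignment and counting how often they co-occur. A minimal Python sketch is given below; the function name, the column indices and the toy alignment are hypothetical, and a real analysis would first map the Asp256 and Tyr118 positions of BcYkfC onto alignment columns. The last line simply reproduces the percentage quoted in the text from the two reported counts.

```python
def cooccurrence(aligned_seqs, asp_col, tyr_col):
    """Return (n_with_asp, n_with_both) for two 0-based alignment columns."""
    with_asp = [s for s in aligned_seqs if s[asp_col] == "D"]
    with_both = [s for s in with_asp if s[tyr_col] == "Y"]
    return len(with_asp), len(with_both)


# Toy alignment (hypothetical): column 1 stands in for the Tyr118-equivalent
# position and column 3 for the Asp256-equivalent position.
toy_alignment = [
    "AYKDG",
    "AYRDG",
    "AFKDG",
    "AYKNG",
    "A-KDG",
]
n_asp, n_both = cooccurrence(toy_alignment, asp_col=3, tyr_col=1)
print(n_asp, n_both)

# Counts reported in the text: 239 of 282 sequences carry both residues.
print(f"{100 * 239 / 282:.0f}%")
```

A gap at the tyrosine column, as in the last toy sequence, is counted as absence of the tyrosine, which is one possible way to treat proteins without a detectable SH3b domain.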

Nonetheless, the 239 proteins above with the conserved aspartate (Asp256) and tyrosine (Tyr118) most likely define a YkfC-like subfamily of NlpC/P60 proteins. These NlpC/P60 enzymes are predominantly distributed in four phyla of bacteria: bacteroidetes, cyanobacteria, firmicutes and proteobacteria (-proteobacteria) (Figure 8). They include all currently known enzymes with the same requirement of a free N-terminal L-Ala, i.e., DPP VI, YkfC, and AvPCP/NpPCP. The cyanobacterial enzymes contain one SH3b domain, while homologs in other bacteria contain two SH3b domains. As expected, the SH3b domains are highly divergent. The βD strand, where the conserved tyrosine is located, is more conserved (Figure 6D). This strand is also responsible for the interface with the catalytic domain. BcYkfC is a member of a cluster of highly conserved homologs that are present in closely related species such as Bacillus

anthracis, Bacillus thuringiensis and Bacillus weihenstephanensis (sequence identity ≥95%). All the members of this group contain signal peptides. Signal peptides are also present in some members of the bacteroidetes group, while no signal peptides are identified in proteobacteria and cyanobacteria homologs.

It is likely that the YkfC subfamily of highly specialized enzymes evolved from a common ancestor, probably a P60-like general-purpose enzyme with two or more SH3b domains connected to a C-terminal NlpC/P60 domain. The SH3b domains may have lost their function as targeting domains. The nonessential SH3b domain then degraded, with the complete loss of the second SH3b domain in the case of the cyanobacterial enzymes.

Conclusion

The structure of BcYkfC in complex with L-Ala-γ-D-Glu is the first structural representative of an NlpC/P60 enzyme with a bound ligand. It allows us to identify the determinants of substrate specificity, which in turn leads to the classification of a subfamily of highly specialized NlpC/P60 enzymes. It also provides structural and functional insights into the NlpC/P60 family of enzymes in general.

Acknowledgements

The project is sponsored by the National Institute of General Medical Sciences Protein Structure Initiative (U54 GM074898). Portions of this research were carried out at the Stanford

Synchrotron Radiation Lightsource (SSRL). The SSRL is a national user facility operated by

Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy

Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of

Energy, Office of Biological and Environmental Research, and by the National Institutes of

Health (National Center for Research Resources, Biomedical Technology Program, and the

National Institute of General Medical Sciences). Genomic DNA from Bacillus cereus NRS248

(ATCC #10987D) was obtained from the American Type Culture Collection (ATCC). The

content is solely the responsibility of the authors and does not necessarily represent the official

views of the National Institute of General Medical Sciences or the National Institutes of Health.

References

1. Doyle, R. J., Chaloupka, J., and Vinter, V. (1988) Turnover of cell walls in

microorganisms, Microbiol. Rev. 52, 554-567.

2. Park, J. T., and Uehara, T. (2008) How bacteria consume their own exoskeletons

(turnover and recycling of cell wall peptidoglycan), Microbiol. Mol. Biol. Rev. 72, 211-

227.

3. Uehara, T., and Park, J. T. (2007) An anhydro-N-acetylmuramyl-L-alanine amidase with

broad specificity tethered to the outer membrane of Escherichia coli, J. Bacteriol. 189,

5634-5641.

4. Uehara, T., Suefuji, K., Valbuena, N., Meehan, B., Donegan, M., and Park, J. T. (2005)

Recycling of the anhydro-N-acetylmuramic acid derived from cell wall murein involves a

two-step conversion to N-acetylglucosamine-phosphate, J. Bacteriol. 187, 3643-3649.

5. Uehara, T., and Park, J. T. (2004) The N-acetyl-D-glucosamine kinase of Escherichia

coli and its role in murein recycling, J. Bacteriol. 186, 7273-7279.

6. Uehara, T., and Park, J. T. (2003) Identification of MpaA, an amidase in Escherichia coli that hydrolyzes the γ-D-glutamyl-meso-diaminopimelate bond in murein peptides, J. Bacteriol. 185, 679-682.

7. Schmidt, D. M., Hubbard, B. K., and Gerlt, J. A. (2001) Evolution of enzymatic activities

in the enolase superfamily: functional assignment of unknown proteins in Bacillus

subtilis and Escherichia coli as L-Ala-D/L-Glu epimerases, Biochemistry 40, 15707-

15715.

8. Anantharaman, V., and Aravind, L. (2003) Evolutionary history, structural features and

biochemical diversity of the NlpC/P60 superfamily of enzymes, Genome Biol. 4, R11.

9. Smith, T. J., Blackman, S. A., and Foster, S. J. (2000) Autolysins of Bacillus subtilis:

multiple enzymes with multiple functions, Microbiology 146(2), 249-262.

10. Kuhn, M., and Goebel, W. (1989) Identification of an extracellular protein of Listeria

monocytogenes possibly involved in intracellular uptake by mammalian cells, Infect.

Immun. 57, 55-61.

11. Asano, S. I., Nukumizu, Y., Bando, H., Iizuka, T., and Yamamoto, T. (1997) Cloning of

novel enterotoxin genes from Bacillus cereus and Bacillus thuringiensis, Appl. Environ.

Microbiol. 63, 1054-1057.

12. Teng, F., Kawalec, M., Weinstock, G. M., Hryniewicz, W., and Murray, B. E. (2003) An

Enterococcus faecium secreted antigen, SagA, exhibits broad-spectrum binding to

extracellular matrix proteins and appears essential for E. faecium growth, Infect. Immun.

71, 5033-5041.

13. Xu, Q., Sudek, S., McMullan, D., Miller, M. D., Geierstanger, B., Jones, D. H., Krishna,

S. S., Spraggon, G., Bursalay, B., Abdubek, P., Acosta, C., Ambing, E., Astakhova, T.,

Axelrod, H. L., Carlton, D., Caruthers, J., Chiu, H. J., Clayton, T., Deller, M. C., Duan,

L., Elias, Y., Elsliger, M. A., Feuerhelm, J., Grzechnik, S. K., Hale, J., Won Han, G.,

Haugen, J., Jaroszewski, L., Jin, K. K., Klock, H. E., Knuth, M. W., Kozbial, P., Kumar,

A., Marciano, D., Morse, A. T., Nigoghossian, E., Okach, L., Oommachen, S., Paulsen,

J., Reyes, R., Rife, C. L., Trout, C. V., van den Bedem, H., Weekes, D., White, A., Wolf,

G., Zubieta, C., Hodgson, K. O., Wooley, J., Deacon, A. M., Godzik, A., Lesley, S. A.,

and Wilson, I. A. (2009) Structural basis of murein peptide specificity of a γ-D-glutamyl-L-diamino acid endopeptidase, Structure 17, 303-313.

14. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and

Lipman, D. J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein

database search programs, Nucleic Acids Res. 25, 3389-3402.

15. Eddy, S. R. (1998) Profile hidden Markov models, Bioinformatics 14, 755-763.

16. Bateman, A., Coin, L., Durbin, R., Finn, R. D., Hollich, V., Griffiths-Jones, S., Khanna,

A., Marshall, M., Moxon, S., Sonnhammer, E. L., Studholme, D. J., Yeats, C., and Eddy,

S. R. (2004) The Pfam protein families database, Nucleic Acids Res. 32, D138-141.

17. Plewniak, F., Bianchetti, L., Brelivet, Y., Carles, A., Chalmel, F., Lecompte, O., Mochel,

T., Moulinier, L., Muller, A., Muller, J., Prigent, V., Ripp, R., Thierry, J. C., Thompson,

J. D., Wicker, N., and Poch, O. (2003) PipeAlign: A new toolkit for protein family

analysis, Nucleic Acids Res. 31, 3829-3832.

18. Katoh, K., Kuma, K., Toh, H., and Miyata, T. (2005) MAFFT version 5: improvement in

accuracy of multiple sequence alignment, Nucleic Acids Res. 33, 511-518.

19. Klock, H. E., Koesema, E. J., Knuth, M. W., and Lesley, S. A. (2008) Combining the

polymerase incomplete primer extension method for cloning and mutagenesis with

microscreening to accelerate structural genomics efforts, Proteins 71, 982-994.

20. Santarsiero, B. D., Yegian, D. T., Lee, C. C., Spraggon, G., Gu, J., Scheibe, D., Uber, D.

C., Cornell, E. W., Nordmeyer, R. A., Kolbe, W. F., Jin, J., Jones, A. L., Jaklevic, J. M.,

Schultz, P. G., and Stevens, R. C. (2002) An approach to rapid protein crystallization using nanodroplets, J. Appl. Cryst. 35, 278-281.

21. Lesley, S. A., Kuhn, P., Godzik, A., Deacon, A. M., Mathews, I., Kreusch, A., Spraggon,

G., Klock, H. E., McMullan, D., Shin, T., Vincent, J., Robb, A., Brinen, L. S., Miller, M.

D., McPhillips, T. M., Miller, M. A., Scheibe, D., Canaves, J. M., Guda, C., Jaroszewski,

L., Selby, T. L., Elsliger, M. A., Wooley, J., Taylor, S. S., Hodgson, K. O., Wilson, I. A.,

Schultz, P. G., and Stevens, R. C. (2002) Structural genomics of the Thermotoga

maritima proteome implemented in a high-throughput structure determination pipeline,

Proc. Natl. Acad. Sci. USA 99, 11664-11669.

22. Cohen, A. E., Ellis, P. J., Miller, M. D., Deacon, A. M., and Phizackerley, R. P. (2002)

An automated system to mount cryo-cooled protein crystals on a synchrotron beamline,

using compact samples cassettes and a small-scale robot, J. Appl. Cryst. 35, 720-726.

23. Kabsch, W. (1993) Automatic processing of rotation diffraction data from crystals of

initially unknown symmetry and cell constants, J. Appl. Cryst. 26, 795-800.

24. Schneider, T. R., and Sheldrick, G. M. (2002) Substructure solution with SHELXD, Acta

Cryst. D 58, 1772-1779.

25. Bricogne, G., Vonrhein, C., Flensburg, C., Schiltz, M., and Paciorek, W. (2003)

Generation, representation and flow of phase information in structure determination:

recent developments in and around SHARP 2.0, Acta Cryst. D 59, 2023-2030.

26. Cohen, S. X., Morris, R. J., Fernandez, F. J., Ben Jelloul, M., Kakaris, M., Parthasarathy,

V., Lamzin, V. S., Kleywegt, G. J., and Perrakis, A. (2004) Towards complete validated

models in the next generation of ARP/wARP, Acta Cryst. D 60, 2222-2229.

27. Emsley, P., and Cowtan, K. (2004) Coot: model-building tools for molecular graphics,

Acta Cryst. D 60, 2126-2132.

28. Murshudov, G. N., Vagin, A. A., and Dodson, E. J. (1997) Refinement of

macromolecular structures by the maximum-likelihood method, Acta Cryst. D 53, 240-

255.

29. CCP4. (1994) The CCP4 suite: programs for protein crystallography, Acta Cryst. D 50,

760-763.

30. Davis, I. W., Murray, L. W., Richardson, J. S., and Richardson, D. C. (2004)

MOLPROBITY: structure validation and all-atom contact analysis for nucleic acids and

their complexes, Nucleic Acids Res. 32, W615-619.

31. Kall, L., Krogh, A., and Sonnhammer, E. L. (2007) Advantages of combined

transmembrane topology and signal peptide prediction--the Phobius web server, Nucleic

Acids Res. 35, W429-432.

32. Cruickshank, D. W. (1999) Remarks about protein structure precision, Acta Cryst. D 55,

583-601.

33. Holm, L., and Sander, C. (1995) Dali: a network tool for protein structure comparison,

Trends Biochem. Sci. 20, 478-480.

34. Marino, M., Banerjee, M., Jonquieres, R., Cossart, P., and Ghosh, P. (2002) GW domains

of the Listeria monocytogenes invasion protein InlB are SH3-like and mediate binding to

host ligands, EMBO J. 21, 5623-5634.

35. Lu, J. Z., Fujiwara, T., Komatsuzawa, H., Sugai, M., and Sakon, J. (2006) Cell wall-

targeting domain of glycylglycine endopeptidase distinguishes among peptidoglycan

cross-bridges, J. Biol. Chem. 281, 549-558.

36. Srisailam, S., Lukin, J. A., Lemak, A., Yee, A., and Arrowsmith, C. H. (2006) Sequence

specific resonance assignment of a hypothetical protein PA0128 from Pseudomonas

aeruginosa, J Biomol NMR 36 Suppl 1, 27.

37. Korndorfer, I. P., Danzer, J., Schmelcher, M., Zimmer, M., Skerra, A., and Loessner, M.

J. (2006) The crystal structure of the bacteriophage PSA endolysin reveals a unique fold

responsible for specific recognition of Listeria cell walls, J. Mol. Biol. 364, 678-689.

38. Aramini, J. M., Rossi, P., Huang, Y. J., Zhao, L., Jiang, M., Maglaqui, M., Xiao, R.,

Locke, J., Nair, R., Rost, B., Acton, T. B., Inouye, M., and Montelione, G. T. (2008)

Solution NMR structure of the NlpC/P60 domain of lipoprotein Spr from Escherichia

coli: structural evidence for a novel cysteine peptidase catalytic triad, Biochemistry 47,

9715-9717.

39. Vacheron, M. J., Guinand, M., Francon, A., and Michel, G. (1979) [Characterisation of a new endopeptidase from sporulating Bacillus sphaericus which is specific for the γ-D-glutamyl-L-lysine and γ-D-glutamyl-(L)meso-diaminopimelate linkages of peptidoglycan substrates (author's transl)], Eur. J. Biochem. 100, 189-196.

40. Bateman, A., and Rawlings, N. D. (2003) The CHAP domain: a large family of amidases

including GSP amidase and peptidoglycan hydrolases, Trends Biochem. Sci. 28, 234-237.

41. Rigden, D. J., Jedrzejas, M. J., and Galperin, M. Y. (2003) Amidase domains from bacterial and phage autolysins define a family of γ-D,L-glutamate-specific amidohydrolases, Trends Biochem. Sci. 28, 230-234.

42. Nasertorabi, F., Tars, K., Becherer, K., Kodandapani, R., Liljas, L., Vuori, K., and Ely,

K. R. (2006) Molecular basis for regulation of Src by the docking protein p130Cas, J.

Mol. Recognit. 19, 30-38.

43. Pai, C. H., Chiang, B. Y., Ko, T. P., Chou, C. C., Chong, C. M., Yen, F. J., Chen, S.,

Coward, J. K., Wang, A. H., and Lin, C. H. (2006) Dual binding sites for translocation

catalysis by Escherichia coli glutathionylspermidine synthetase, EMBO J. 25, 5970-5982.

Legends

Figure 1. Proposed metabolic pathways for murein peptides in model bacteria E. coli and B.

subtilis. Catalytic enzymes are highlighted in red.

Figure 2. Genomic context of ykfB and ykfC genes in B. subtilis and B. cereus.

Figure 3. Crystal structure of YkfC from B. cereus in complex with L-Ala-γ-D-Glu. (A) Ribbons representation of BcYkfC. The three domains (SH3b1, SH3b2 and NlpC/P60) are colored blue, green and red, respectively. The bound L-Ala-γ-D-Glu is shown as sticks. (B) Ribbons representations of individual domains with secondary structures labeled. (C) Molecular surface of YkfC colored by sequence conservation. The most conserved residues are shown in red, the non-conserved residues in white.

Figure 4. Active site and recognition of L-Ala-γ-D-Glu by BcYkfC. (A) 2Fo-Fc density of L-Ala-γ-D-Glu identified in the active site, contoured at 1.5σ. (B) The S binding sites of BcYkfC.

Hydrogen bonds and distances are shown as dashed lines. (C) L-Ala-γ-D-Glu (stick) is located in the active site defined by the SH3b1 domain (blue) and the NlpC/P60 domain (red). (D) Interaction between L-Ala-γ-D-Glu and the active site of YkfC. This figure was generated using the program MOE 2008.10 (Chemical Computing Group Inc.).

Figure 5. Sequence alignment of YkfC from B. subtilis and B. cereus, dipeptidyl-peptidase VI (DPP VI) from Bacillus sphaericus, and a γ-D-glutamyl-L-diamino acid endopeptidase from Anabaena variabilis. The sequence numbering and secondary structures of YkfC from B. cereus are shown at the top. The active site residues are marked at the bottom.

Figure 6. Models of substrate recognition by BcYkfC. (A) L-Ala-γ-D-Glu-DAP was docked into the active site of BcYkfC. The protein surface is colored according to a gradient in electrostatic potential (red: negative, blue: positive) (MOE 2008.10, Chemical Computing Group Inc.). (B) Recognition of γ-D-Glu by five residues of YkfC (Tyr226, Trp228, Asp237, Ser239 and Ser257). (C) Sequence conservation of the active sites in NlpC/P60 domains, based on 2277 NlpC/P60 domains with an intact catalytic dyad and conserved Tyr226. (D) Sequence conservation of the active sites in the YkfC subfamily of NlpC/P60 enzymes, based on 282 sequences selected for the presence of a conserved aspartate residue at position 256 of BcYkfC. The conservation of this residue is highly correlated with a conserved tyrosine in the βD region of the SH3b domain (Tyr118 of BcYkfC).

Figure 7. Structural comparisons between BcYkfC and the cyanobacterial NlpC/P60 endopeptidase AvPCP. (A) BcYkfC and the cyanobacterial endopeptidase AvPCP (PDB 2hbw) shown in the same orientation. The equivalent residues are shown in red. (B) A stereoview of the Cα traces of the two proteins shown in A (coloring the same as in A). (C) The S binding sites of the two proteins are nearly identical. The equivalent residues for the cyanobacterial

endopeptidase are labeled in parentheses. (D) Comparison of the active site cavities and their environments. The catalytic cysteine is shown in white. A stick model of a docked murein tripeptide is shown in the active site of BcYkfC.

Figure 8. UPGMA circular phylogram of the YkfC subfamily of NlpC/P60 enzymes. The bacteria from bacteroidetes, cyanobacteria, firmicutes and proteobacteria (-proteobacteria) are shown in blue, cyan, green and red, respectively.

Tables

Table 1. Data collection and refinement statistics (PDB code 3h41)

Space group | C2
Unit cell | a=95.2 Å, b=59.4 Å, c=61.3 Å, β=103.3°

Data collection | λ1 MADSe - peak | λ2 MADSe - remote | λ3 MADSe - inflection
Wavelength (Å) | 0.9786 | 0.9184 | 0.9799
Resolution range (Å) | 41.5-1.79 | 41.5-1.84 | 41.5-1.86
Number of observations | 116,822 | 107,729 | 104,295
Number of unique reflections | 31,085 | 28,629 | 27,652
Completeness (%) | 98.0 (95.1)^a | 98.6 (98.1) | 98.5 (97.9)
Mean I/σ(I) | 11.3 (2.1)^a | 12.5 (2.8) | 13.1 (2.8)
Rmerge on I (%) | 11.1 (71)^a | 10.0 (54) | 10.0 (56)
Highest resolution shell (Å) | 1.88-1.79 | 1.94-1.84 | 1.96-1.86

MAD phasing
Resolution | 41.5-1.79
Number of Se sites | 5
Mean figure of merit | 0.36

Model and refinement statistics
Resolution range (Å) | 50-1.79 | Data set used in refinement | λ1 MADSe
No. of reflections (total) | 31,083 | Cutoff criteria | |F|>0
No. of reflections (test) | 1,564 | Rcryst | 16.3
Completeness (% total) | 98.0 | Rfree | 19.7

Stereochemical parameters
Restraints (RMS observed) | | MolProbity statistics |
Bond lengths (Å) | 0.015 | All-atom clash score | 5.07
Bond angles (°) | 1.50 | Ramachandran favored (%) | 97.7
Average isotropic B-value (Å²) | 19.9 | Ramachandran outliers | 1
ESU based on Rfree (Å) | 0.11 | Rotamer outliers | 1
Protein residues / atoms | 305/2,463

^a Highest resolution shell in parentheses. The high resolution cutoff was chosen such that the mean I/σ(I) in the highest resolution shell is around 2.
ESU = Estimated overall coordinate error (32).
$R_{merge} = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl) \rangle| / \sum_{hkl}\sum_{i} I_i(hkl)$
$R_{cryst} = \sum_{hkl} ||F_{obs}| - |F_{calc}|| / \sum_{hkl} |F_{obs}|$, where $F_{calc}$ and $F_{obs}$ are the calculated and observed structure factor amplitudes, respectively.
Rfree = as for Rcryst, but for 5.0% of the total reflections chosen at random and omitted from refinement.
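The residual definitions given in the footnotes of Table 1 can be written out directly. The Python sketch below is illustrative only: the function names and toy numbers are hypothetical, the calculated amplitudes are assumed to be already on the same scale as the observed ones, and refinement programs apply scaling and other corrections that are omitted here.

```python
def r_cryst(f_obs, f_calc):
    """Sum over reflections of ||Fobs| - |Fcalc|| divided by the sum of |Fobs|."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_merge(observations):
    """Rmerge from a mapping of hkl index to its repeated intensity measurements."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Toy amplitudes and intensities, for illustration only.
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
print(f"Rcryst = {r_cryst(f_obs, f_calc):.3f}")

measurements = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(f"Rmerge = {r_merge(measurements):.3f}")
```

Rfree uses the same expression as Rcryst, evaluated only over the test reflections that were set aside and excluded from refinement.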

Table 2. Structural comparisons of BcYkfC and bacterial proteins with at least one common domaina

Aligned        PDB    Ref.   RMSD   Aligned       No. residues   Z score   Seq. id   Comments
substructure   code          (Å)    length (aa)   (aa)                     (%)

SH3b+NlpC/P60  2fg0   (13)   2.1    193           221            21.3      26        Cyanobacterial NpPCP
               2hbw   (13)   2.3    193           220            21.2      29        Cyanobacterial AvPCP

SH3b           1m9s   (34)   1.9    60            523            6.0       12        TDb/Internalin B
               1r77   (35)   2.3    57            103            4.7       12        TD/PG hydrolase ALE-1
               1xov   (37)   3.0    51            317            3.5       10        TD/Endolysin PlyPSA
               2akk   (36)   1.8    44            74             2.7       7         PhnA-like protein
               1x27   (42)   1.9    54            164            5.7       13        Eukaryotic SH3

NlpC/P60       2jyx   (38)   1.9    119           129            17.8      29        E. coli lipoprotein Spr
               2p1g          2.5    110           230            9.6       18        Uncharacterized
               2ioa   (43)   3.2    106           589            7.3       9         CHAP domain/GspSb

aThe alignment is performed by the DALI (33) structural comparison server using full-length BcYkfC and individual domains (SH3b1 and NlpC/P60) as search probes. No. residues is the number of residues present in the model used for comparison. For proteins with multiple SH3-like domains (PDB codes 1m9s and 1xov) or multiple chains, only the best match is shown.

bGspS: Glutathionylspermidine synthetase/amidase; TD: targeting domain.

TOC figure:

Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

Figure 6.

Figure 7.

Figure 8.
