(NAS Colloquium) Proteolytic Processing and Physiological Regulation


COLLOQUIUM ON PROTEOLYTIC PROCESSING AND PHYSIOLOGICAL REGULATION

NATIONAL ACADEMY OF SCIENCES

WASHINGTON, D.C. 1999

About this PDF file: This new digital representation of the original work has been recomposed from XML files created from the original paper book, not from the original typesetting files. Page breaks are true to the original; line lengths, word breaks, heading styles, and other typesetting-specific formatting, however, cannot be retained, and some typographic errors may have been accidentally inserted. Please use the print version of this publication as the authoritative version for attribution.

NATIONAL ACADEMY OF SCIENCES

Colloquium Series

In 1991, the National Academy of Sciences inaugurated a series of scientific colloquia, five or six of which are scheduled each year under the guidance of the NAS Council’s Committee on Scientific Programs. Each colloquium addresses a scientific topic of broad and topical interest, cutting across two or more of the traditional disciplines. Typically two days long, colloquia are international in scope and bring together leading scientists in the field. Papers from colloquia are published in the Proceedings of the National Academy of Sciences (PNAS).


Proteolytic Processing and Physiological Regulation

A COLLOQUIUM SPONSORED BY THE NATIONAL ACADEMY OF SCIENCES

FEBRUARY 20–21, 1999

Saturday, February 20, 1999

Hans Neurath, University of Washington

Welcome and introduction: Proteolytic enzymes, past and future

David Agard, University of California, San Francisco

Kinetic stability and folding of proteases: twin paradigms for protease pro regions

Michael James, University of Alberta

Structural basis and mechanism of zymogen activation

David Matthews, Agouron Pharmaceuticals, Inc.

Structure-assisted design of mechanism-based irreversible inhibitors of human rhinovirus 3C protease with potent antiviral activity against multiple rhinovirus serotypes

Christopher Walsh, Harvard University

Role of D,D-peptidase in vancomycin resistance

Earl Davie, University of Washington

Introduction to Protease activated receptors

Shaun Coughlin, University of California, San Francisco

Thrombin signaling: Molecular mechanisms and roles in vivo

Vishva Dixit, Genentech, Inc.

Identification of components of the cell death pathway

Wolfram Bode, Max-Planck-Institute for Biochemistry

Structure of tryptase, a cage-like serine proteinase involved in asthma, allergic and inflammatory disorders

Philip Beachy, Johns Hopkins University

Hedgehog protein biogenesis and signaling

Marc Kirschner, Harvard University

The role of proteases in the regulation of cell cycle


Sunday, February 21, 1999

C. S. Craik, University of California, San Francisco

Introduction

Arthur Horwich, Yale University

Chaperone Rings in Protein Folding and Degradation

Robert Huber, Max-Planck-Institute for Biochemistry

Structure of the archaeal and yeast 20S proteasomes and of the eubacterial Analog HslV

Sukanto Sinha, Athena Neurosciences

Cellular mechanism of beta amyloid production and secretion

Michael Brown, University of Texas Southwestern Medical Center

A proteolytic system that controls cholesterol metabolism

Michael Brown

Introduction

Charles Craik, University of California, San Francisco

Reverse biochemistry: using protease inhibitors to dissect complex biochemical processes

Christine Debouck, Smith-Kline and Beecham Pharmaceuticals

From genomics to drugs—cathepsin K and osteoporosis

James McKerrow, University of California, San Francisco

Parasite proteases—windows on molecular evolution and targets for drug design

Joshua Boger, Vertex Pharmaceuticals

Recognizing a drug


PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

Table of Contents

Papers from a National Academy of Sciences Colloquium on Proteolytic Processing and Physiological Regulation

Proteolytic enzymes, past and future
Hans Neurath
10962–10963

Caspase activation: The induced-proximity model
Guy S. Salvesen and Vishva M. Dixit
10964–10967

Structural aspects of activation pathways of aspartic protease zymogens and viral 3C protease precursors
Amir R. Khan, Nina Khazanovich-Bernstein, Ernst M. Bergmann, and Michael N. G. James
10968–10975

The catalytic sites of 20S proteasomes and their role in subunit maturation: A mutational and crystallographic study
Michael Groll, Wolfgang Heinemeyer, Sibylle Jäger, Tobias Ullrich, Matthias Bochtler, Dieter H. Wolf, and Robert Huber
10976–10983

The structure of the human βII-tryptase tetramer: Fo(u)r better or worse
Christian P. Sommerhoff, Wolfram Bode, Pedro J. B. Pereira, Milton T. Stubbs, Jörg Stürzebecher, Gerd P. Piechottka, Gabriele Matschiner, and Andreas Bergner
10984–10991

Sonic hedgehog protein signals not as a hydrolytic enzyme but as an apparent ligand for Patched
Naoyuki Fuse, Tapan Maiti, Baolin Wang, Jeffery A. Porter, Traci M. Tanaka Hall, Daniel J. Leahy, and Philip A. Beachy
10992–10999

Structure-assisted design of mechanism-based irreversible inhibitors of human rhinovirus 3C protease with potent antiviral activity against multiple rhinovirus serotypes
D. A. Matthews, P. S. Dragovich, S. E. Webber, S. A. Fuhrman, A. K. Patick, L. S. Zalman, T. F. Hendrickson, R. A. Love, T. J. Prins, J. T. Marakovits, R. Zhou, J. Tikhe, C. E. Ford, J. W. Meador, R. A. Ferre, E. L. Brown, S. L. Binford, M. A. Brothers, D. M. DeLisle, and S. T. Worland
11000–11007

Kinetic stability as a mechanism for protease longevity
Erin L. Cunningham, Sheila S. Jaswal, Julie L. Sohl, and David A. Agard
11008–11014

Cysteine protease inhibitors as chemotherapy: Lessons from a parasite target
Paul M. Selzer, Sabine Pingel, Ivy Hsieh, Bernhard Ugele, Victor J. Chan, Juan C. Engel, Matthew Bogyo, David G. Russell, Judy A. Sakanari, and James H. McKerrow
11015–11022


How the protease thrombin talks to cells
Shaun R. Coughlin
11023–11027

VanX, a bacterial D-alanyl-D-alanine dipeptidase: Resistance, immunity, or survival function?
Ivan A. D. Lessard and Christopher T. Walsh
11028–11032

Chaperone rings in protein folding and degradation
Arthur L. Horwich, Eilika U. Weber-Ban, and Daniel Finley
11033–11040

A proteolytic pathway that controls the cholesterol content of membranes, cells, and blood
Michael S. Brown and Joseph L. Goldstein
11041–11048

Cellular mechanisms of β-amyloid production and secretion
Sukanto Sinha and Ivan Lieberburg
11049–11053

Reverse biochemistry: Use of macromolecular protease inhibitors to dissect complex biological processes and identify a membrane-type serine protease in epithelial cancer and normal tissue
Toshihiko Takeuchi, Marc A. Shuman, and Charles S. Craik
11054–11061


National Academy of Sciences Colloquia

BOUND REPRINTS AVAILABLE

In 1991, the National Academy of Sciences (NAS) inaugurated a series of scientific colloquia, several of which are held each year under the auspices of the NAS Council Committee on Scientific Programs. These colloquia address scientific topics of broad and topical interest that cut across two or more traditional disciplines. Typically two days long, these colloquia are international in scope and bring together leading scientists in the field.

Papers presented at these colloquia are published in the Proceedings of the National Academy of Sciences (PNAS) and are available online (www.pnas.org). Because they have generated much interest, these papers are now available in the form of collected bound reprints, which may be ordered through the National Academy Press.

Currently available are:

Carbon Dioxide and Climate Change ($11). Held November 13–15, 1995 (Irvine, CA)
Computational Biomolecular Science ($16). Held September 12–13, 1997 (Irvine, CA)
Earthquake Prediction ($16). Held February 10–11, 1995 (Irvine, CA)
Elliptic Curves and Modular Forms ($7). Held March 15–17, 1996 (Washington, DC)
Genetic Engineering of Viruses and Viral Vectors ($21). Held June 9–11, 1996 (Irvine, CA)
Genetics and the Origin of Species ($8). Held January 31–February 1, 1997 (Irvine, CA)
Geology, Mineralogy, and Human Welfare ($11). Held November 8–9, 1998 (Irvine, CA)
Neurobiology of Pain ($8). Held December 11–13, 1998 (Irvine, CA)
Neuroimaging of Human Brain Function ($17). Held May 29–31, 1997 (Irvine, CA)
Plants and Population: Is There Time? ($8). Held December 5–6, 1998 (Irvine, CA)
Protecting Our Food Supply: The Value of Plant Genome Initiatives ($13). Held May 29–31, 1997 (Irvine, CA)
Science, Technology, and the Economy ($12). Held November 20–22, 1995 (Irvine, CA)
The Age of the Universe, Dark Matter, and Structure Formation ($13). Held March 21–23, 1997 (Irvine, CA)

Papers from future colloquia will be available for purchase after they appear in PNAS.

Shipping and Handling Charges: In the U.S. and Canada please add $4.50 for the first reprint ordered and $0.95 for each additional reprint.

Ordering Information: Telephone orders will be accepted only when charged to VISA, MasterCard, or American Express accounts. To order, call toll-free 1–800–624–6242 or order online at www.nap.edu and receive a 20% discount.


Proteolytic enzymes, past and future

This paper is the introduction to the following papers, which were presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

HANS NEURATH*

Department of Biochemistry, Box 357350, University of Washington, Seattle, WA 98195

ABSTRACT Today’s knowledge is based on yesterday’s research, which, for me, started some 60 years ago. In the introduction to this colloquium, the past history of proteolytic enzymes is briefly reviewed against the background of simultaneously developing concepts and methodologies in protein chemistry. This history is followed by a sketch of more recent developments of the role of proteolytic enzymes in physiological regulation and an outlook on future trends apparent from current research.

The history of proteolytic enzymes is intimately interwoven with that of protein chemistry. In the very early days, proteolytic enzymes were considered an impediment that had to be removed in the isolation of proteins generally. When I entered the field some 60 years ago, Northrop, Kunitz, and Herriott (1) had published the first edition of their treatise Crystalline Enzymes and demonstrated that, contrary to some prevailing notions, the crystalline proteolytic enzymes and protease inhibitors that they had isolated were chemical entities of constant solubility and hence obeyed the thermodynamic criteria of pure compounds. These compounds included pepsinogen, pepsin, and pepsin inhibitor, chymotrypsin, trypsin, their zymogens and inhibitors, carboxypeptidase, ribonuclease, hexokinase, diphtheria antitoxin, and a few others. Because these proteins were commercially unavailable, anyone interested in studying them had to isolate them the hard way. The field lay relatively dormant and awaited the development of more effective and specific methods of isolation, purification, and characterization of proteins, which came some 20 years later, including the methods of chromatography, gel electrophoresis, gel filtration, ultracentrifugation, amino acid analysis, and protein sequencing (2). In an effort to avoid the complexity of protein substrates, low molecular-weight synthetic peptides and their ester analogs were synthesized and found to simulate the specificity requirements of these proteases. Other landmarks included the discovery of natural and synthetic protease inhibitors such as diisopropyl fluorophosphate, which introduced an organic phosphate label into the active site of serine proteases. Chemical characterization of active sites together with x-ray structure analysis of proteases showed that they can be grouped into families of common mechanism, similar structural features, and hence common evolutionary origin. They included the well known families of serine, cysteine, aspartic, and metallo endo- and exopeptidases.

The number of proteases under investigation in the early days is minuscule compared with the current inventory of several thousand proteolytic enzymes that are coded by 2% of the structural gene pool (3). Interest in proteases was considerably stimulated by the recognition that, aside from their digestive action, proteases are involved in the regulation of a great many physiological processes. In many cases, regulation is mediated by the association of proteases with nonproteolytic domains that confer specificity to their interaction with receptor sites. The most studied among them are the proteases involved in blood coagulation, fibrinolysis, the complement system, and the processing of protein hormone precursors by specific convertases. A telling case of such an association is enterokinase, a protease that fulfills the simple but specific task of cleaving the amino-terminal hexapeptide during the activation of trypsinogen. Although enterokinase was discovered more than 50 years ago, it was only recently that its x-ray structure was elucidated by cloning and expressing the heavy chain (4). Surprisingly, it was found to be composed of a trypsin-like catalytic domain covalently bound to a series of nonprotease domains that also exist in unrelated proteins. One of these resembles the low density lipoprotein receptor, another resembles meprin, a third occurs in complement C1r, and yet another occurs in a macrophage receptor. The functional significance of these specific combinations is unknown.

The term “limited proteolysis” was coined by Linderstrom-Lang to differentiate the restricted specificity of certain enzymes under certain conditions from the random proteolysis accompanying protein degradation. Proteolytic processing can be limited by the specificity of the protease, the accessibility of the susceptible peptide bond of the substrate, the obligatory activation of an enzyme precursor, the action of protease inhibitors, or a combination of these factors.

By far the best characterized and perhaps most versatile proteolytic enzymes are the serine proteases. Together with their inhibitors, they regulate a great variety of physiological events. Whereas initially the different specificities of trypsin and chymotrypsin were exclusively ascribed to differences in the sequence and structure of the primary substrate-binding site (aspartic acid in trypsin vs. serine in chymotrypsin), this simple explanation had to be abandoned when Craik and coworkers (5) demonstrated that, in addition, two surface loops are changed, indicating that conformational changes at distant secondary binding sites are also required. It has also been shown that the introduction of a metal binding site by site-directed mutagenesis allows the interconversion of a protease belonging to the serine family into another that can be regulated like a zinc metalloprotease (6, 7). However, the metal inhibits the serine protease but is essential for metalloprotease activity. Relative newcomers among the families of proteases are the caspases, which resemble each other in amino acid sequence, structure, and substrate specificity, as will be discussed in a paper to follow [G. S. Salvesen and V. M. Dixit (8)]. Another important recent advance is the isolation and characterization of proteasomes [R. Huber (9)].

One of the earliest and best understood cases of proteolytic processing is zymogen activation. It underlies a great variety of physiological regulations, particularly when coupled to consecutive activation reactions as in the cascades of blood coagulation, fibrinolysis, the complement reaction, and others. The key point here is that a signal can be specifically and irreversibly amplified every time a downstream inactive enzyme precursor is activated. Recent work, to be presented in the paper to follow (8), has demonstrated a specific role of the pro segment of activation, which early on was regarded as a throwaway piece but in certain cases can act as an intramolecular inhibitor and as an intramolecular chaperone that assures proper folding of the active enzyme (10).

*To whom reprint requests should be addressed. E-mail: [email protected]. PNAS is available online at www.pnas.org.

Certain generalizations have emerged from these and related investigations. If I may borrow a page from Nancy Thornberry (11), most proteases are synthesized as inactive precursors (zymogens) that require limited proteolysis for activation. Because proteolysis is irreversible under physiological conditions, the generation of the uncleaved precursor requires de novo synthesis. All active proteases, including those that activate zymogens, are regulated by specific inhibitors. However, some protease precursors can regulate their own activation, e.g., trypsinogen, whereas others, e.g., plasminogen, do not require peptide bond cleavage for their activation. Proteolytic processing, like all proteolytic reactions, requires unique combinations of primary, secondary, and tertiary structures to permit interaction with substrate so as to form the reactive enzyme-substrate intermediate.
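The amplification argument made above for activation cascades can be made concrete with a small numerical sketch. The Python snippet below is illustrative only and is not taken from the colloquium papers: the function name `cascade_output`, the three-stage layout, and the turnover value of 100 zymogen molecules per active enzyme are hypothetical assumptions, chosen to show how counts multiply from stage to stage.

```python
def cascade_output(turnovers, initial_active=1):
    """Return the number of active enzyme molecules at each stage.

    turnovers[i] is the assumed number of downstream zymogen molecules
    that one active enzyme at stage i converts before it is inhibited.
    The first entry of the result is the initial signal itself.
    """
    counts = [initial_active]
    for turnover in turnovers:
        # Each active molecule activates `turnover` molecules downstream.
        counts.append(counts[-1] * turnover)
    return counts


# Hypothetical three-step cascade; every enzyme activates 100 zymogens.
stages = cascade_output([100, 100, 100])
print(stages)  # [1, 100, 10000, 1000000]
```

In this toy picture the final output is the product of the stage turnovers. Specific inhibitors, which the text notes regulate all active proteases, would correspond to smaller turnover values at the affected stage.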

Let me now make a major leap in time and discuss in brief how we reached the current era of research on proteolytic enzymes and what we can expect in the millennium that we are about to enter. Two major factors have expanded our conceptual horizons and endowed us with experimental tools of previously unimaginable powers of resolution. One factor is the application of the newly emerging concepts and methodologies of molecular and cell biology, such as DNA cloning and sequencing, site-directed mutagenesis, gene amplification, gene knockouts, phage display, and the wealth of information yielded by genomics research generally. The other major impetus came from a group of newly developed concepts and experimental approaches to the structure and function of proteins by mass spectroscopy (12), multidimensional NMR, and the use of computers for the prediction of protein structure based on various types of algorithms. To these one might add the methods of combinatorial chemistry as applied to proteins to scan and identify protein ligands of physiological significance. Although we are still far from understanding the rules of the in vivo folding of nascent polypeptide chains, the challenge lies in deriving the function of a protein from its known chemical and biological parameters and in learning how to design proteins of predetermined physiological properties. All of these developments, singly and in combination, expand our horizons and the goals that we are setting for their application to biology and medicine.

The importance of proteolytic enzymes to the understanding of vital biological tasks is perhaps best illustrated by current trends in the study of viral proteases (13). In every known instance, the timing, placement, and mode of action of the virus-encoded protease are somehow adapted to the conditions under which it operates within the viral environment. Two examples follow: in herpes viruses such as cytomegalovirus, the structure of the protease reveals a catalytic triad of His/His/Ser instead of the conventional Asp/His/Ser of the mammalian serine proteases and a single beta barrel structure per monomer instead of two in the mammalian serine proteases (13). Analogously, in adenoviruses the cysteine protease contains a Glu/His/Cys catalytic triad characteristic of cysteine proteases, but the seven alpha helices and a single five-stranded beta sheet are not seen in the parent protease (papain). In either case, the examples given demonstrate the ability of the virus proteases to adapt themselves to the evolution of functions within the limits of compatible protein structures (13).

Other rapidly expanding areas of biological research involving well known proteases include those of apoptosis, the mediation of thrombin signaling by protease-activated receptors, proteolytic processing in cholesterol metabolism, in the cell cycle, and the many others included in this issue of the Proceedings.

It is no coincidence that industry and academia are almost equally represented in this audience, because intense cooperation between both is essential if we are to reap the full benefits of the advances and discoveries in both basic and applied research.

1. Northrop, J.H., Kunitz, M. & Herriott, R.M. (1938) Crystalline Enzymes (Columbia Univ. Press, New York).
2. Neurath, H. (1995) Protein Sci. 4, 1939–1943.
3. Barrett, A.J., Rawlings, N.D. & Woessner, J.F. (1998) in Handbook of Proteolytic Enzymes (Academic, New York), pp. xiii–xxix.
4. Kitamoto, Y., Yuan, X., Wu, Q., McCourt, D.W. & Sadler, J.E. (1994) Proc. Natl. Acad. Sci. USA 91, 7588–7592.
5. Perona, J.J. & Craik, C.S. (1995) Protein Sci. 4, 337–360.
6. Higaki, J.N., Fletterick, R.J. & Craik, C.S. (1992) Trends Biochem. Sci. 17, 100–104.
7. Klemba, M., Gardner, K.H., Marino, S., Clarke, N.D. & Regan, L. (1995) Struct. Biol. 2, 368–373.
8. Salvesen, G.S. & Dixit, V.M. (1999) Proc. Natl. Acad. Sci. USA 96, 10964–10967.
9. Groll, M., Heinemeyer, W., Jäger, S., Ullrich, T., Bochtler, M., Wolf, D.H. & Huber, R. (1999) Proc. Natl. Acad. Sci. USA 96, 10976–10983.
10. Cunningham, E.L., Jaswal, S.S. & Agard, D.A. (1999) Proc. Natl. Acad. Sci. USA 96, 11008–11014.
11. Thornberry, N.A. & Lazebnik, Y. (1998) Science 281, 1312–1316.
12. Cohen, S.L. (1996) Structure (London) 4, 1013–1016.
13. Babé, L.M. & Craik, C.S. (1997) Cell 91, 427–430.


Caspase activation: The induced-proximity model

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

GUY S. SALVESEN*† AND VISHVA M. DIXIT‡

*Programs in Cell Death and Aging Research, Burnham Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037; and ‡Department of Molecular Oncology, Genentech Inc., 460 Point San Bruno Boulevard, South San Francisco, CA 94080

ABSTRACT Members of the caspase family of proteases transmit the events that lead to apoptosis of animal cells. Distinct members of the family are involved in both the initiation and execution phases of cell death, with the initiator caspases being recruited to multicomponent signaling complexes. Initiation of apoptotic events depends on the ability of the signaling complexes to generate an active protease. The mechanism of activation of the caspases that constitute the different apoptosis-signaling complexes can be explained by an unusual property of the caspase zymogens to autoprocess to an active form. This autoprocessing depends on intrinsic activity that resides in the zymogens of the initiator caspases. We review evidence for a hypothesis, the induced-proximity model, that describes how the first proteolytic signal is produced after adapter-mediated clustering of initiator caspase zymogens.

Apoptosis is a mechanism that regulates cell number and is vital throughout the life of all animals. Though several different types of biochemical events have been recognized as important in apoptosis, perhaps the most fundamental is the participation of members of a family of cysteine-dependent, Asp-specific proteases known as the caspases (1–3). Caspases cleave a number of cellular proteins, and the process is one of limited proteolysis in which a small number of cuts, usually only one, are made in interdomain regions. Sometimes cleavage results in activation of the protein, sometimes in inactivation, but never in degradation, because their substrate specificity distinguishes the caspases as among the most restricted of endopeptidases.

Singularly important in this context is that caspase zymogens are themselves substrates for caspases, such that some are able to activate others in a hierarchical relationship (Fig. 1). Thus, pathways exist to transmit signals via sequential caspase activations, and this event has been most extensively examined in apoptosis. It is relatively easy to imagine that the caspases operating at the bottom of the pathway are activated by the ones above. Until recently, the questions of how the first caspase in a pathway became activated and how the first death signal was generated were perplexing issues. Now, several groups have focused on this issue (4–7) and have arrived at a consensus to describe the intriguing operation of the initiation of the proteolytic pathways that execute apoptosis. Though the basic hypothesis is supported, many issues remain to be explained, not the least of which is the nature of the mechanism that governs the process. This paper reviews the support for the hypothesis, the induced-proximity model, and its current limitations.

Apoptosis Triggered by Death Receptors. One of the most intensively studied pathways to cell death results from ligation of transmembrane death receptors belonging to the tumor necrosis factor-R1 (TNF-R1) family. After engagement by specific ligands, these receptors transmit a lethal signal that results in classic apoptotic cell death (8, 9). Because simple transfection of death receptors is usually sufficient to sensitize cells to a death ligand, it follows that the components required to transduce this signal reside in many cells. Thus TNF-R1 family members serve as a conduit for the transfer of death signals into the cell’s interior after interaction with their extracellular cognate ligands. The TNF-R1/TNF pair itself presents a rather complex pathway with which to dissect apoptosis initiation, because this receptor/ligand pair can signal either apoptosis or an antagonistic NF-κB-mediated survival pathway, depending on the cellular context. The TNF-R1 homologue Fas (CD95/Apo-1) has been the paradigm of choice, because addition of its cognate ligand, FasL, or even receptor agonist antibodies rapidly signals cell death (10).

Because agonist Fas antibodies can trigger apoptosis, it was possible to use them to isolate the components of the death-inducing signaling complex (DISC) that forms after Fas ligation (4, 11). A combination of yeast two-hybrid and protein sequence analysis revealed a seemingly simple DISC, comprising Fas itself, the adapter molecule FADD, and caspase-8 (Fig. 1). This discovery revealed a potential solution to the perplexing problem of how the first proteolytic signal was generated during apoptosis, because it implicated a caspase directly in the triggering event. Before this work, receptors were thought to signal either by altering the phosphorylation status of key signaling molecules or by functioning as ion channels. Death receptors such as Fas, by contrast, signal by direct recruitment and activation of a protease (caspase-8). How exactly does the recruited zymogen become active? To understand this process as a basis for formulating an adequate hypothesis, one must understand the unusual properties of caspase zymogens that set them apart from most other proteases. Unlike most other proteases, caspase zymogens usually become activated on simple expression in Escherichia coli (12, 13). This activation results from processing that is a consequence of intrinsic proteolytic activity residing in the caspase zymogens. It is not caused by E. coli proteases, as indicated by the fact that catalytically disabled C285A (caspase-1 numbering convention) mutants fail to undergo processing.

Self-Processing of Caspase Zymogens. In common with other protease zymogens (14), with notable exceptions (see Table 1), generation of an active caspase usually requires limited proteolysis (Fig. 2). The activating cleavage takes place within a short segment that, in the zymogen, connects the large and small subunits of the catalytic domain, with both subunits containing essential components of the catalytic machinery. The location of cleavage within this segment need not be precise in vitro (15); nevertheless, the highly conserved Asp-297 (caspase-1 numbering convention) directs cleavage specificity within this segment in vivo. Proteolytic processing that results in activation usually occurs at this Asp residue, such that most activated caspases can process their own and other caspase zymogens, given sufficient time and a high enough concentration in vitro (16–18). The extent to which this processing occurs in vivo, however, is regulated by the residues surrounding Asp-297. For example, the sequence surrounding Asp-297 in the downstream executioner caspases 3 and 7 fits the extended substrate specificity of the initiator caspases 8 and 9 remarkably well (19). With the notable exception of at least caspase-2 (20), distinctions in substrate specificity within the caspase family fit closely to the S4–S1 subsite preferences deduced from synthetic peptidic substrates (19).

†To whom reprint requests should be addressed. E-mail: [email protected]. PNAS is available online at www.pnas.org.
Abbreviations: TNF, tumor necrosis factor; DISC, death-inducing signaling complex; DED, death-effector domain.

FIG. 1. The framework of apoptosis. Death may be signaled by direct ligand-enforced clustering of receptors at the cell surface, which leads to the activation of the “initiator” caspase-8 (casp-8). This caspase then directly activates the “executioner” caspases 3 and 7 (and possibly 6), which are predominantly responsible for the limited proteolysis that characterizes apoptotic dismantling of the cell. Alternatively, irreparable damage to the genome caused by mutagens, pharmaceuticals that inhibit DNA repair, or ionizing radiation leads to the activation of another initiator, caspase-9 (28). The latter event requires the recruitment of pro-caspase-9 to proteins such as Apaf-1, which requires the proapoptotic factor cytochrome c (cyto C) to be released from mitochondria (29). Though other modulators probably regulate the apoptotic pathway in a cell-specific manner (30), this framework is considered common to most mammalian cells.

Table 1. Zymogenicities of some caspases compared with two serine proteases

Protease      Zymogenicity
Caspase-3     >10,000
Caspase-8     100
Caspase-9     10
Trypsin       >10,000
tPA           2–10

Zymogenicity is defined as the ratio of the activity of a processed protease to the activity of the zymogen on any given substrate (27). Data for trypsin and tissue plasminogen activator (tPA) are taken from ref. 27. The interesting range of zymogenicity values displayed by members of the caspase family is mirrored by members of the chymotrypsin family, with trypsin and tPA shown for comparison. Presumably, enzymes such as tPA and caspase-9 have downplayed the requirement for proteolysis as a mechanism of substantially increasing their activities, because allosteric regulators substitute for this function: fibrin for tPA and Apaf-1 for caspase-9. In the case of tPA, specific side-chain interactions, absent in other members of the chymotrypsin family, allow activity of the zymogen. However, in the absence of a molecular structure of the caspase-8 and caspase-9 zymogens, little evidence is available to explain the high activity of the unprocessed protein. One clue is suggested by the structure of active caspases 1 and 3, each of which is composed of two catalytic units thought to arise from the dimerization of monomeric zymogens (reviewed in ref. 3). If activation of the zymogens of the initiator caspases 8 and 9 and of CED3 operates by clustering, then the clustering phenomenon may be explained by adapter-driven homodimerization of monomers. However, as detailed in Future Directions, the molecular mechanisms are far from clear.

The Induced-Proximity Hypothesis. Interestingly, depending on expression conditions, one can obtain either processed active caspase or unprocessed zymogen from the same construct, at least for caspases 3, 7, and 9 (15, 21, 22). For example, short induction times (<30 min) yield unprocessed zymogens, but longer ones (>3 hours) yield fully processed enzymes. Significantly, even very short expression times and low inducer concentrations have failed to yield caspase-8 zymogens in our studies (G.S. and H. Stennicke, unpublished work). Caspase-8 processes itself extremely rapidly on heterologous expression in E. coli, suggesting that the zymogen must possess significant intrinsic proteolytic activity, allowing for autoprocessing. These observations are the basis for the induced-proximity hypothesis for the operation of the DISC, the assembly of which forces a locally high concentration of caspase-8 zymogens in a process mediated by recruited FADD (Fig. 3). This clustering of zymogens possessing intrinsic enzymatic activity would allow for processing in trans as well as activation of the first protease in the cascade.

The hypothesis would need to be tested by asking whether the zymogen form of caspase-8 possessed reasonable enzymatic activity. Because such a test could not be made by expressing the wild-type precursor, a nonprocessable mutant was generated by replacing the two Asp cleavage sites within the large/small subunit linker segment with Ala. These replacements enabled the generation of a “frozen” zymogen that could be obtained in quantity after expression in E. coli. Significantly, the frozen zymogen retained the same specificity against caspase inhibitors and synthetic substrates but cleaved these substrates at 1% of the rate of an equivalent concentration of fully processed enzyme. The mechanistic origin of this rate differential is currently unknown, but, significantly, the zymogenicity of caspase-8, the ratio of its activity as a fully active enzyme to the activity of its unprocessed zymogen, was 100 (4). The importance of zymogenicity is detailed in Table 1.
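The relationship between the 1% relative rate and a zymogenicity of 100 can be made explicit with a short calculation. The sketch below is illustrative only: the function name and the use of relative (unitless) rates are assumptions introduced here, not part of the original study.

```python
def zymogenicity(processed_rate: float, zymogen_rate: float) -> float:
    """Ratio of processed-enzyme activity to zymogen activity.

    Both rates must be measured on the same substrate at equivalent
    enzyme concentrations; the units cancel in the ratio.
    """
    if zymogen_rate <= 0:
        raise ValueError("zymogen activity must be positive to form the ratio")
    return processed_rate / zymogen_rate


# Hypothetical relative rates: the frozen caspase-8 zymogen cleaves
# substrate at 1% of the rate of the fully processed enzyme.
processed = 1.0
frozen_zymogen = 0.01

caspase8_zymogenicity = zymogenicity(processed, frozen_zymogen)
print(f"caspase-8 zymogenicity is about {caspase8_zymogenicity:.0f}")
```

By the same definition, a zymogen that is nearly as active as its processed form has a zymogenicity close to 1.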

Testing the Hypothesis. The in vitro observations on the high zymogenicity of caspase-8 suggested that a test of the induced autoprocessing hypothesis was mandated, preferably in vivo. With this mandate in mind, we generated a caspase-8 construct in which the DED domains of the zymogen were replaced by a myristoylation signal, followed by three tandem repeats of a derivative FK506-binding protein (FKBP). The latter had been designed by Schreiber and colleagues (23) to act as an artificial mimic of natural cellular recruitment processes. Artificial oligomerization of proteins carrying the FKBP domains was induced by treatment with the cell-penetrant FK1012, a dimeric form of FK506. Ectopic expression of the catalytically active chimera was tolerated fairly well by two human cell lines, even in the presence of monomeric FK506. However, on addition of dimeric FK1012, the cells underwent apoptosis by a mechanism that depended on the catalytic function of the chimeric caspase-8, because replacing the catalytic Cys by Ser failed to elicit the same effect. This technique, later termed the “artificial death switch” (24), has taken a prominent position in the exploration of apoptosis initiation.

These data, the in vitro observations on the zymogenicity of pro-caspase-8, and the artificially induced death of cells harboring the chimeric FKBP-caspase-8 are fully consistent with the induced-proximity model. Indeed, since this original description, the postmitochondrial initiator caspase-9 (7) and the Caenorhabditis elegans caspase CED3 (25) have both been implicated in congruent proximity activation mechanisms. Is this mechanism a common one for the basis of generating biochemical death signals? Possibly. However, a caveat must be added to the caspase-9 issue, because this caspase has a very low zymogenicity (22); in other words, it is almost as active before as it is after proteolytic processing! Thus, in the case of caspase-9, an alternative pathway may be used.

FIG. 2. Caspase activation by proteolysis. Caspases are synthesized as single-chain precursors that await activation within the cell. Activation usually proceeds in all caspases by cleavage at the conserved Asp-297 (caspase-1 numbering convention). After this activation, an as-yet undescribed conformational change is thought to occur, bringing the activity and specificity determinants (quarter circles in the linear precursor) into the correct alignment for catalysis. Frequently an N-terminal peptide is removed; however, the reason for this removal is obscure, because it is apparently not required for zymogen activation. In the example of caspase-8 shown in the figure, the N-peptide (sometimes called the prodomain) contains death-effector domains (DEDs) required for recruitment to the cytosolic face of death receptors. The crystal structures of caspases 1 and 3 reveal a dimer of small and large subunits in the active, processed state, and it is assumed, though not specifically demonstrated, that this organization is the case for caspases in solution. The active sites in the putative dimer are shown as open circles.

If the single-chain zymogens of caspases 8 and 9 are partly active, why are they not dangerous to healthy cells? They should cause a slow production of active executioner caspases. This question is most readily answered by the presence of endogenous caspase inhibitors, members of the IAP (inhibitor of apoptosis protein) family (31). Members of this family inhibit executioner caspases 3 and 7, and we propose that they present a barrier to caspase activity that must be exceeded before sufficient execution potential can be achieved. Thus, in the presence of IAPs, a little caspase activation is acceptable, because it would be rapidly saturated by the inhibitors. It is only when a sufficient concentration of activated executioner caspases builds up that apoptosis occurs. In this hypothesis, the IAPs regulate the apoptotic threshold.

FIG. 3. Model for the operation of the DISC. Assembly of the DISC occurs in a hierarchical manner. On ligation of Fas, its “death domain” (white circle) binds to a homologous domain in the adapter FADD, which in turn recruits the zymogen of caspase-8 by a homophilic interaction requiring the homologous DEDs (black circles). Immediately after recruitment, the zymogen is processed by an adjacent zymogen, resulting in proteolytic activation and origination of active caspase-8 as the initiating death signal. Activation is thought to result from cleavage at Asp-297 (caspase-1 numbering convention). Presumably, the active form of caspase-8 (designated as a dimer as seen in the structures of active caspases 1 and 3) releases itself from the adapter after proteolytic removal of the N-terminal DED, though it is not clear how the endogenous activated enzyme distributes in the cell.

Future Directions. Notwithstanding the attractiveness of the induced-proximity model, there remain a number of open questions. For example, although the data support the hypothesis, the molecular mechanisms of the event(s) have not been explained, and there are a number of issues that need to be addressed in the near future. These issues are as follows. (i) Must the processed caspase-8 be released from the DISC to diffuse toward its downstream substrates? (ii) Does activation require dimerization, a consensus for the catalytic form of caspases 1 and 3 at least? (iii) Does processing occur in cis (intramolecular) or in trans (intermolecular)? (iv) Must the zymogens be specifically aligned within the recruitment complex, and how many zymogen molecules constitute an activation locus? (v) Is the minimal operative DISC as simple as the one depicted in Fig. 3, or are other proteins required (26)? These questions cut to the heart of uncertainties surrounding the fundamental activation mechanism of all the caspases, and each is (in principle) answerable by generating specific mutants and by using the artificial death-switch technique. Perhaps it is already possible to settle the issue of cis versus trans processing; in our hands, it is rarely possible to observe activation of caspase zymogens in the nanomolar range, but on artificial concentration toward the micromolar range, one observes processing and activation. This observation would imply a second-order reaction, which is most easily understood in terms of trans processing. Indeed, this proposal makes sense, because it is much easier to regulate zymogen activation in trans than in cis. The answers to these questions will require the molecular structure of at least one caspase zymogen (preferably caspase-8). Their resolution will certainly lead to a better understanding of the molecular mechanism of the DISC, with the attendant possibilities of interfering therapeutically to either initiate or prevent the commitment step in death-receptor-mediated apoptosis.
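The kinetic argument for trans processing can be illustrated numerically. The following sketch assumes idealized mass-action kinetics with a concentration-independent rate constant (which cancels in the ratios); the function name and the specific concentrations of 1 nM and 1 µM are illustrative assumptions, not measured values.

```python
def relative_rate(zymogen_molar: float, order: int) -> float:
    """Mass-action rate, up to a constant factor, for a reaction of the given order."""
    return zymogen_molar ** order


NANOMOLAR = 1e-9   # dilute zymogen, where activation is rarely observed
MICROMOLAR = 1e-6  # artificially concentrated (clustered) zymogen

# cis (intramolecular) processing is first order in zymogen;
# trans (intermolecular) processing is second order.
fold_cis = relative_rate(MICROMOLAR, 1) / relative_rate(NANOMOLAR, 1)
fold_trans = relative_rate(MICROMOLAR, 2) / relative_rate(NANOMOLAR, 2)

print(f"first-order (cis) rate increases about {fold_cis:.0e}-fold")
print(f"second-order (trans) rate increases about {fold_trans:.0e}-fold")
```

Under these assumptions a thousandfold increase in concentration raises a first-order rate a thousandfold but a second-order rate a millionfold, and the per-molecule rate of a first-order process does not change at all. This is why strongly concentration-dependent activation is most easily read as processing in trans.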

This work was supported by grants from the National Heart, Lung, and Blood Institute, National Institute on Aging, and National Institute of Neurological Disorders and Stroke.

1. Salvesen, G.S. & Dixit, V.M. (1997) Cell 91, 443–446.
2. Cohen, G.M. (1997) Biochem. J. 326, 1–16.
3. Thornberry, N.A. & Lazebnik, Y. (1998) Science 281, 1312–1316.
4. Muzio, M., Stockwell, B.R., Stennicke, H.R., Salvesen, G.S. & Dixit, V.M. (1998) J. Biol. Chem. 273, 2926–2930.
5. Martin, D.A., Siegel, R.M., Zheng, L. & Lenardo, M.J. (1998) J. Biol. Chem. 273, 4345–4349.
6. Yang, X., Chang, H.Y. & Baltimore, D. (1998) Mol. Cell 1, 319–325.
7. Srinivasula, S.M., Ahmad, M., Fernandes-Alnemri, T. & Alnemri, E.S. (1998) Mol. Cell 1, 949–957.
8. Ashkenazi, A. & Dixit, V.M. (1998) Science 281, 1305–1308.
9. Ware, C.F., Santee, S. & Glass, A. (1998) in The Cytokine Handbook (Academic, London), 3rd Ed., pp. 549–592.
10. Nagata, S. & Goldstein, P. (1995) Science 267, 1449–1456.
11. Boldin, M.P., Goncharov, T.M., Goltsev, Y.V. & Wallach, D. (1996) Cell 85, 803–815.
12. Orth, K., O’Rourke, K., Salvesen, G.S. & Dixit, V.M. (1996) J. Biol. Chem. 271, 20977–20980.
13. Stennicke, H.R. & Salvesen, G.S. (1997) J. Biol. Chem. 272, 25719–25723.
14. Neurath, H. (1989) Trends Biochem. Sci. 14, 268–271.
15. Zhou, Q. & Salvesen, G.S. (1997) Biochem. J. 324, 361–364.
16. Srinivasula, S.M., Ahmad, M., Fernandes-Alnemri, T., Litwack, G. & Alnemri, E.S. (1996) Proc. Natl. Acad. Sci. USA 93, 14486–14491.
17. Muzio, M., Salvesen, G.S. & Dixit, V.M. (1997) J. Biol. Chem. 272, 2952–2956.
18. Slee, E.A., Harte, M.T., Kluck, R.M., Wolf, B.B., Casiano, C.A., Newmeyer, D.D., Wang, H.G., Reed, J.C., Nicholson, D.W., Alnemri, E.S., et al. (1999) J. Cell Biol. 144, 281–292.
19. Thornberry, N.A., Rano, T.A., Peterson, E.P., Rasper, D.M., Timkey, T., Garcia-Calvo, M., Houtzager, V.M., Nordstrom, P.A., Roy, S., Vaillancourt, J.P., et al. (1997) J. Biol. Chem. 272, 17907–17911.
20. Talanian, R.V., Quinlan, C., Trautz, S., Hackett, M.C., Mankovich, J.A., Banach, D., Ghayur, T., Brady, K.D. & Wong, W.W. (1997) J. Biol. Chem. 272, 9677–9682.
21. Stennicke, H.R., Jurgensmeier, J.M., Shin, H., Deveraux, Q., Wolf, B.B., Yang, X., Zhou, Q., Ellerby, H.M., Ellerby, L.M., Bredesen, D., et al. (1998) J. Biol. Chem. 273, 27084–27090.
22. Stennicke, H.R., Deveraux, Q.L., Humke, E.W., Reed, J.C., Dixit, V.M. & Salvesen, G.S. (1999) J. Biol. Chem. 274, 8359–8362.
23. Spencer, D.M., Belshaw, P.J., Chen, L., Ho, S.N., Randazzo, F., Crabtree, G.R. & Schreiber, S.L. (1996) Curr. Biol. 6, 839–847.
24. MacCorkle, R.A., Freeman, K.W. & Spencer, D.M. (1998) Proc. Natl. Acad. Sci. USA 95, 3655–3660.
25. Yang, X., Chang, H.Y. & Baltimore, D. (1998) Science 281, 1355–1357.
26. Imai, Y., Kinura, T., Murakami, A., Yajima, N., Sakamaki, K. & Yonehara, S. (1999) Nature (London) 398, 777–785.
27. Tachias, K. & Madison, E.L. (1996) J. Biol. Chem. 271, 28749–28752.
28. Li, P., Nijhawan, D., Budihardjo, I., Srinivasula, S.M., Ahmad, M., Alnemri, E.S. & Wang, X. (1997) Cell 91, 479–489.
29. Zou, H., Henzel, W.J., Liu, X., Lutschg, A. & Wang, X. (1997) Cell 90, 405–413.
30. Green, D.R. & Reed, J.C. (1998) Science 281, 1309–1312.
31. Deveraux, Q., Takahashi, R., Salvesen, G.S. & Reed, J.C. (1997) Nature (London) 388, 300–304.

Structural aspects of activation pathways of aspartic protease zymogens and viral 3C protease precursors

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

AMIR R. KHAN*, NINA KHAZANOVICH-BERNSTEIN, ERNST M. BERGMANN, AND MICHAEL N. G. JAMES†

Medical Research Council Group in Protein Structure and Function, Department of Biochemistry, University of Alberta, Edmonton, Alberta T6G 2H7, Canada

ABSTRACT The three-dimensional structures of the inactive protein precursors (zymogens) of the serine, cysteine, aspartic, and metalloprotease classes of proteolytic enzymes are known. Comparisons of these structures with those of the mature, active proteases reveal that, in general, the preformed, active conformations of the residues involved in catalysis are rendered sterically inaccessible to substrates by the residues of the zymogens’ N-terminal extensions or prosegments. The prosegments interact in nonsubstrate-like fashions with the residues of the active sites in most of the cases. The gastric aspartic proteases have a well-characterized zymogen conversion pathway. Structures of human progastricsin, the inactive intermediate 2, and active human pepsin are known and have been used to define the conversion pathway. The structure of the zymogen precursor of plasmepsin II, the malarial aspartic protease, shows a new twist on the mode of inactivation used by the gastric zymogens. The prosegment of proplasmepsin disrupts the active conformation of the two catalytic aspartic acid residues by inducing a major reorientation of the two domains of the mature protease. The picornaviral 2A and 3C proteases have a chymotrypsin-like tertiary structure but with a cysteine nucleophile. These enzymes cleave themselves from the viral polyprotein in cis (intramolecular cleavage) and carry out trans cleavages of other scissile peptides important for the virus life cycle. Although the structure of the precursor viral polyprotein is unknown, it probably resembles the organization of the proenzymes of the bacterial serine proteases subtilisin and α-lytic protease. Cleavage of the prosegment is known to occur in cis for these precursor molecules.

Zymogens of proteolytic enzymes consist of the intact protease with an N-terminal extension. Conversion of the inactive zymogen to the mature, active protease requires limited proteolysis, usually of a single peptide bond (1). Molecular rearrangements accompany the proteolytic removal of the prosegment of the zymogen, eventually leading to the mature protease. The prosegments of the zymogens range in size from two residues for some of the granzymes to more than 150 residues for α-lytic protease, a bacterial serine protease (2).

The conversion of zymogens to the respective active enzymes is achieved by several different mechanisms (3). The active serine proteases of the chymotrypsin family result from limited proteolysis of the zymogens by convertases. For example, the cascade of the blood-clotting enzymes (4) involves the conversion of inactive forms (e.g., prothrombin) to active forms of the enzyme (thrombin) by a highly specific catalytic cleavage by another of the clotting enzymes (factor Xa). On the other hand, simply changing the pH of the solution in which the gastric aspartic protease zymogens are dissolved from ~6.5 to 3.0 (an increase in [H+] of ~3,100-fold) is sufficient to bring about the conversion (5). In a similar fashion, the zymogens of the papain-like cysteine proteases are converted to the active enzymes in a pH-regulated fashion. The in vitro activation of propapain is consistent with an initial intramolecular cleavage event (6). The conversion of procarboxypeptidase is initiated by trypsin cleavage of the Arg-99p-Ala-1 bond at the prosegment to mature enzyme junction (7). Prostromelysin-1 can be converted to the active form by other proteolytic enzymes, heat, or the presence of organomercurial agents (8).
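The size of the proton concentration change quoted for the pH shift follows directly from the definition pH = −log10[H+]. The short sketch below reproduces the arithmetic; the function name is a hypothetical helper introduced for illustration, and the starting pH of 6.5 is the approximate value given in the text.

```python
def proton_fold_increase(ph_start: float, ph_end: float) -> float:
    """Fold increase in [H+] when the pH drops from ph_start to ph_end.

    Uses [H+] = 10 ** (-pH), so the ratio is 10 ** (ph_start - ph_end).
    """
    return 10 ** (ph_start - ph_end)


# Approximate pH shift that triggers gastric zymogen conversion.
fold = proton_fold_increase(6.5, 3.0)
print(f"[H+] rises roughly {fold:,.0f}-fold")
```

A drop of 3.5 pH units corresponds to a factor of 10^3.5, a little over 3,100, in line with the approximate figure quoted above.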

There are some generalities regarding zymogen conversion that one can make in light of the three-dimensional structures of both the zymogens and the respective active enzymes (3). First, the residues that constitute the active sites of the protease portions of the zymogens have virtually identical conformations to those of the mature, active proteases. The major exceptions are the serine proteases of the chymotrypsin family. The activation process involves the formation of an ion pair between the newly formed N-terminal residue Ile-16 NH3+ and the β-carboxylate of Asp-194 (9), which triggers the conformational changes that form the oxyanion binding pocket and the active conformation of the S1 specificity pocket [the nomenclature of Schechter and Berger (10) is used throughout this manuscript].

Second, the preformed active sites of the protease portions of zymogens are generally not accessible to substrates because residues of the prosegments sterically block the approach to the active sites. This statement does not hold for the chymotrypsin-like serine proteases, as the active sites of these zymogens are able to bind protein inhibitors that induce conformational changes that form the oxyanion hole in spite of the ion pair involving Asp-194 and Ile-16 being absent (11).

Proteolysis of the portion of the prosegments that interact with the active site residues is prevented in several different ways. In prostromelysin the prosegment passes through the active site in the reverse polypeptide direction (N→C) relative to substrates or transition state mimics (12). A reverse orientation of the prosegment blocking the active site in the cysteine protease zymogens also has been observed in the structures of rat procathepsin B (13) and human procathepsin L (14). The region of the prosegment of the gastric aspartic proteases interacting most intimately with the catalytic residues Asp-32 and Asp-215 (porcine pepsin numbering) includes a highly conserved lysine at position 36p (the residue numbers of the prosegment are followed by p). The εNH3+ group of Lys-36p forms an ion pair with each of the two active site carboxylate groups (15).

Conversion of Gastric Aspartic Protease Zymogens

The molecular structures of human progastricsin (16), activation intermediate 2 of human gastricsin (17), and a structural homolog to mature human gastricsin, human pepsin 3 (18), allow one to construct a reasonably detailed view of the pathway followed in the conversion of the inactive zymogen to the active protease. Fig. 1 shows stereo ribbon diagrams of each of these three molecular structures. Fig. 2 shows a diagrammatic view of the conversion pathway. This pathway is a general pathway for the gastric aspartic proteases, but the individual enzymes differ in detail.

*Present address: Department of Molecular and Cellular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA 02138.
†To whom reprint requests should be addressed. E-mail: [email protected]. PNAS is available online at www.pnas.org.
Abbreviation: HPV, human polio virus.

FIG. 1. Structures on the conversion pathway of the aspartic protease zymogen progastricsin. The structure of human gastricsin is not known; the human pepsin structure therefore has been used as a model for gastricsin. This figure, as well as Figs. 3–6, was prepared with BOBSCRIPT (19) and RASTER 3D (20). (A) The structure of human progastricsin (16) represented in stereo. The residues of the prosegment (Ala-1p to Leu-43p) are in green; those of the gastricsin portion of the zymogen are in blue, except for those regions that undergo large conformational changes (Ser-1 to Ala-13, Phe-71 to Thr-81, and Tyr-125 to Ala-136), which are represented in mauve. The promature junction is Leu-43p-Ser-1; the peptide bond cleaved intramolecularly is Phe-26p to Leu-27p. The side chains of Asp-32 and Asp-217 are represented in red. (B) Stereo view of the molecular structure of intermediate 2 on the activation pathway of human gastricsin (17). The color scheme used is the same as in A. The residues missing on this figure, Leu-22p to Phe-26p and Ser-1, are disordered in the structure, and there is no interpretable electron density for them on the maps. The water molecule bound between the two carboxyl groups of Asp-32 and Asp-217 is shown as a red sphere. The final step in the conversion involves the dissociation of the peptide Ala-1p to Phe-26p from gastricsin, with the N-terminal residues of gastricsin, Ser-1 (N-ter) to Ala-13, replacing the N-terminal β-strand of the prosegment. (C) The structure of human pepsin (18) shown as a model of human gastricsin. The regions of gastricsin that undergo large conformational changes from their positions in progastricsin are shown in pink, and the active site aspartates with the bound catalytic H2O molecule are colored red. Reproduced with permission from ref. 3.

FIG. 2. A diagrammatic representation of the conversion pathway of progastricsin to gastricsin. The prosegment of progastricsin (I), A1p to L43p, has three helical segments and a net positive charge, forming several ion pair interactions and electrostatic stabilization with the mature portion of the zymogen. The highly conserved K37p interacts directly with the catalytic aspartates Asp-32 and Asp-217. Lowering the pH of the solution below 4.0 converts progastricsin into intermediate 1 (26), represented by II. Refolding of the prosegment in the vicinity of the active gastricsin brings the first scissile bond, Phe-26p-Leu-27p, to the exposed active site aspartates (III). Cleavage at the promature junction, Leu-43p-Ser-1 (IV), is likely an intermolecular cleavage (28) and results in intermediate 2 (V), a molecular species that consists of Ala-1p to Phe-26p noncovalently associated with mature gastricsin (17). The final step in the conversion results from the dissociation of the N-terminal peptide 1–26 of the prosegment and refolding of the residues Ser-1 to Ala-13 to replace the region of the prosegment in the six-stranded β-sheet of gastricsin (VI).

Progastricsin consists of a single polypeptide chain of 372 aa (21). The N-terminal extension or prosegment is 43 aa in length and comprises residues Ala-1p to Leu-43p. The prosegment is folded into a compact domain having an initial extended β-strand (Val-3p to Lys-8p) followed by three helical segments: Ile-13p to Lys-20p, Leu-22p to Arg-28p, and Pro-34p to Arg-39p (16). The third helical segment (a 3₁₀ helix) packs against the active site residues, and the ε-NH₃⁺ group of a conserved lysine residue (Lys-37p in progastricsin) forms ion pair interactions with the carboxyl groups of the two catalytic aspartates, Asp-32 and Asp-217. Two tyrosine side chains, Tyr-38p and Tyr-9, form symmetric H-bonded interactions with the carboxylates of Asp-217 and Asp-32, respectively, further restricting access to the active site. The phenolic side chain of Tyr-9 occupies the S1 binding pocket; Tyr-38p is in the S1′ binding pocket. The tertiary structure of the prosegment (Leu-1p to Tyr-37p) in porcine pepsinogen (15) is virtually identical to that described above for progastricsin (16).

The polypeptide chain from Tyr-38p to Tyr-9 in progastricsin adopts a conformation that is different from the equivalent segment of chain in the pepsinogens (15, 22). As well, a portion of the polypeptide chain in gastricsin, Tyr-125 to Ala-136 (Fig. 1A), is displaced from the position that this chain segment occupies in all other aspartic protease zymogens and active enzymes (16).

The trigger for initiating the conversion of the gastric aspartic protease zymogens is a drop in pH (5). At neutral pH, the structures of the zymogens are stabilized by the electrostatic interactions of the ion pairs and the inactive conformation is maintained (16). However, when the zymogens reach the acid pH (≈2.0) of the lumen of the stomach, the carboxylate groups become protonated and the repulsive interactions among the net positive charges of the prosegment destabilize its interactions in the active site of the protease. Kinetic studies in the late 1930s showed that the conversion of porcine pepsinogen into pepsin was an autocatalytic process (5, 23). In addition, the fact that the loss of pepsinogen was not accompanied by an equivalent increase in the appearance of pepsin implied the presence of intermediate species on the pathway (5). Spectroscopic studies of this conversion process established that there are conformational changes (24) in the 5-ms to 2-s time scale (25). Rapidly changing the pH back to neutrality can reverse these conformational changes.

Biochemical studies of the conversion of human progastricsin to gastricsin showed the presence of at least two intermediates (26). Intermediate 1 is the species formed rapidly after the pH is dropped below 4.0. The prosegment is unfolded in intermediate 1 and the active site of gastricsin is exposed and accessible to substrates. The first hydrolytic event detected during the activation of progastricsin is the intramolecular cleavage of the Phe-26p to Leu-27p peptide bond (26). Subsequently, an intermolecular cleavage at the Leu-43p-Ser-1 peptide bond (the promature junction) results in the formation of transient intermediate 2 that can be stabilized by transferring the pH to neutrality (>6.5). The resulting molecular species has been characterized biochemically (26) and comprises residues Ala-1p to Phe-26p noncovalently associated with mature gastricsin (Ser-1 to Ala-329).
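The residue bookkeeping implied by the two cleavages described above can be checked with a short sketch. It uses only the numbers quoted in the text (a 43-residue prosegment, mature gastricsin ending at Ala-329, cuts after Phe-26p and Leu-43p). The label scheme and the function names `residue_labels`, `split_after`, and `activation_fragments` are illustrative assumptions, not part of any published analysis.

```python
# Residue bookkeeping for progastricsin activation (illustrative sketch).
# Lengths are taken from the text; the label scheme is an assumption:
# prosegment residues are "1p".."43p", mature residues are "1".."329".

PROSEGMENT_LENGTH = 43   # Ala-1p to Leu-43p
MATURE_LENGTH = 329      # Ser-1 to Ala-329


def residue_labels():
    """Return the ordered residue labels of the intact zymogen."""
    pro = [f"{i}p" for i in range(1, PROSEGMENT_LENGTH + 1)]
    mature = [str(i) for i in range(1, MATURE_LENGTH + 1)]
    return pro + mature


def split_after(chain, label):
    """Cut the chain at the peptide bond following `label`."""
    cut = chain.index(label) + 1
    return chain[:cut], chain[cut:]


def activation_fragments():
    """Apply the two cleavages in the order described in the text."""
    zymogen = residue_labels()
    # First cut: Phe-26p | Leu-27p (intramolecular).
    n_peptide, rest = split_after(zymogen, "26p")
    # Second cut: Leu-43p | Ser-1 (promature junction, intermolecular).
    released_peptide, mature = split_after(rest, "43p")
    return {
        "zymogen": zymogen,
        "n_peptide": n_peptide,
        "released_peptide": released_peptide,
        "mature": mature,
    }


if __name__ == "__main__":
    for name, frag in activation_fragments().items():
        print(f"{name}: {frag[0]} to {frag[-1]}, {len(frag)} residues")
```

In this sketch, intermediate 2 corresponds to the `n_peptide` and `mature` fragments taken together as a noncovalent complex, and the fragment lengths sum to the 372 aa of the zymogen.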

Intermediate 2 recently has been characterized structurally (17), and its structure is depicted in Fig. 1B. The β-strand (Val-3p to Lys-8p) is in the same position as observed in the structure of progastricsin. In addition, the first helix (Ile-13p to Lys-20p) is intact and is oriented very similarly to the way it is in the zymogen structure. The two catalytic aspartates, Asp-32 and Asp-217, have a water molecule bound between them in the same position as the nucleophilic water observed in the native structures of all mature aspartic proteases (27). The S1 binding site is occluded, however, as the side chain of Tyr-9 still forms a hydrogen bond with the carboxylate of Asp-32. The segment Tyr-125 to Ala-136 has moved from its position in progastricsin (Fig. 1A) to the position and conformation common among the mature enzymes whose structures have been solved (Fig. 1 B and C). The ion pair Arg-14p to Asp-11 in intermediate 2 at pH ≈6.5 stabilizes the N-terminal peptide, Ala-1p to Phe-26p, in its original location in the zymogen, preventing the N-terminal residues of gastricsin (Ser-1 to Ala-12) from adopting their final position in the mature enzyme. On the other hand, intermediate 2 would be relatively short-lived at acid pH values <3.0. At these low pH values the carboxylate of Asp-11 would be protonated, and therefore the ion pair with Arg-14p would be severely weakened. Prolonged exposure of intermediate 2 to an acid environment favors dissociation of the β-strand of the zymogen and its replacement by the N terminus of mature gastricsin (Fig. 1C).

Progastricsin (Fig. 2I) has stabilizing electrostatic interactions between the positively charged residues of the prosegment and the negatively charged groups of the gastricsin portion of the zymogen (16). These electrostatic interactions are weakened by the drop in pH that results in the protonation of the carboxylate groups of aspartic and glutamic residues (29–31). In particular, protonation of the catalytic aspartate groups, Asp-32 and Asp-217, weakens the interactions with Lys-37p, allowing the uncoiling of the 3₁₀ helix and permitting the diffusion of this segment away from the active site. The residues surrounding Lys-37p in progastricsin (Thr-29p to Asp-33p and Phe-40p to Leu-43p) all have substantially higher than average B factors, indicating that they are highly mobile and would easily undergo conformational changes that would expose the preformed active site (Fig. 2II).

In contrast, the β-strand at the N terminus of the prosegment (Val-3p to Lys-8p) associates with the mature gastricsin through hydrogen bonding and hydrophobic interactions. These forces are not pH dependent. With the helical regions of the prosegment uncoiled (Fig. 2II) and the polypeptide from roughly Lys-11p to Ala-13 in a dynamic state of flux, eventually a sensitive peptide (e.g., Phe-26p-Leu-27p) would diffuse to the preformed active site and intramolecular cleavage would occur (Fig. 2III). The resulting cleaved form of the zymogen (Fig. 2IV) is enzymatically active and also would be free to catalyze intermolecular cleavages that have been detected kinetically with pepsinogen (32). This is the likely fate of the bond at the prosegment to mature junction (Leu-43p-Ser-1); it is cleaved intermolecularly and the peptide Leu-27p to Leu-43p dissociates from the complex. The noncovalent complex of Ala-1p to Phe-26p bound to gastricsin (intermediate 2 or Fig. 2V) can be stabilized by returning the pH to neutrality. The final step (Fig. 2 V to VI) in the conversion process involves a dissociation of the β-strand and helical regions (Ala-1p to Phe-26p) of the prosegment from gastricsin and its replacement by the N-terminal residues of gastricsin.

The prosegments of the pepsinogens and the progastricsins have very similar sequences and three-dimensional structures. The sequences of prosegments of other aspartic proteases are also similar to those of the gastric enzymes, suggesting that the general features of the conversion process are shared among the chymosins and cathepsins D. Differences in the sites of internal cleavage and the kinetics of the activation process (33) are explained partly by the positions of the cleavage sites in the different prosegments (34).

Conversion of Proplasmepsin II

The plasmepsin system presents a different view of aspartic protease activation than do the gastric proteases. Plasmepsin is the aspartic protease used by the malaria parasite Plasmodium to degrade hemoglobin in red blood cells. The plasmepsins are synthesized as inactive zymogens, the proplasmepsins, having N-terminal prosegments that differ both in sequence and in size from other known aspartic protease zymogens. Proplasmepsin prosegments contain approximately 125 aa and lack sequence similarity with the archetypal gastric zymogen prosegments, which are typically about 45 residues long (35–37). Proplasmepsin prosegments also contain a transmembrane helix that anchors these zymogens to the membrane during delivery from the endoplasmic reticulum to the digestive vacuole where activation and hemoglobin digestion occur (38). Activation of proplasmepsin in vivo is carried out by a maturase at acidic pH (38). Additionally, proplasmepsin II and P. vivax proplasmepsin can be activated autocatalytically at low pH, with the cleavage occurring upstream of the wild-type mature N terminus (39).

The crystal structure of proplasmepsin II from P. falciparum revealed some surprising contrasts with the gastric aspartic protease zymogens (40). Instead of blocking a preformed active site, as in the gastric zymogens, the prosegment in proplasmepsin causes a major distortion of the molecule, preventing the formation of a functional active site.

The recombinant proplasmepsin II used in the crystallographic studies had the prosegment truncated by the first 76 residues to facilitate expression (39). Almost the entire length of this shortened prosegment interacts with the mature portion of proplasmepsin II. The prosegment has a well-defined secondary structure, consisting of an initial β-strand, followed by two α-helices and a coil connection to the mature N terminus (Fig. 3). As in the gastric zymogens (15, 16, 22), the prosegment β-strand participates in the six-stranded β-sheet, the central motif of aspartic proteases, and becomes replaced by the mature N terminus upon activation (Fig. 3). Although the position of the prosegment β-strand is similar to that seen in gastric zymogens, the remainder of the prosegment adopts a very different disposition. Instead of running through the substrate-binding cleft, the two helices interact exclusively with the C domain of the molecule. The promature junction is located in a tight loop composed of residues Tyr-122p to Asp-4, the Tyr-Asp loop, where Asp-4 plays a key role in maintaining the structure of the loop (with hydrogen bonds to Tyr-122p and Ser-1) and anchoring it to the C domain (with hydrogen bonds to Lys-238 and Phe-241) (Fig. 4a).

The N terminus of the mature plasmepsin sequence differs in conformation between proplasmepsin II and plasmepsin II (40, 41). Upon activation, residues 1–14 undergo a large conformational change, placing Asp-4 to Phe-11 into the central β-sheet. Residues 15–29 make a more subtle rearrangement that alters their interactions with the active site Psi loops (Fig. 4 b and c). When the central β-sheet motif and C domain (residues 138–329) of plasmepsin II and proplasmepsin II are superimposed, their N domains (residues 30–129) are related by a rotation of 14° about an axis running roughly in the plane of the central β-sheet and perpendicular to the strands. This domain movement observed in proplasmepsin II is novel in terms of its division into rigid bodies, magnitude, direction, and effect on activity (42). It renders the active site cleft more open in the zymogen than in the enzyme, severely distorting the geometry from that of the active site in plasmepsin II. In proplasmepsin II, the active site Psi loops are farther apart relative to plasmepsin II (Fig. 4b). Asp-34 and Asp-214 are too far apart in the zymogen to carry out the general base activation of a nucleophilic water molecule. The so-called immature active site is therefore catalytically inactive, and upon activation must collapse to the fireman’s grip configuration that defines the active site in all aspartic proteases of known structure (43) (Fig. 4c).
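A domain rotation such as the 14° quoted above is usually reported as the angle of the rotation matrix that relates the two domain positions. The sketch below illustrates only that last step, recovering the angle from a 3×3 rotation matrix through its trace. The matrix is built from an arbitrary, assumed axis by Rodrigues' formula; in a real analysis it would come from a least-squares superposition of the N-domain coordinates, which is not shown here. The function names are hypothetical.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rodrigues' formula: rotation by angle_deg about the given axis."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + x * x * k, x * y * k - z * s, x * z * k + y * s],
        [y * x * k + z * s, c + y * y * k, y * z * k - x * s],
        [z * x * k - y * s, z * y * k + x * s, c + z * z * k],
    ]


def rotation_angle_deg(matrix):
    """Rotation angle from the trace: trace = 1 + 2*cos(theta)."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # The axis below is an arbitrary placeholder, not the axis in the paper.
    m = rotation_matrix((1.0, 2.0, 3.0), 14.0)
    print(f"recovered rotation angle: {rotation_angle_deg(m):.2f} degrees")
```

The angle is independent of the axis chosen, which is why a single number can summarize the magnitude of the domain movement while the axis direction is described separately.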

The method of inactivation in proplasmepsin II is different from that observed in the gastric aspartic protease zymogens. In proplasmepsin there is no positively charged moiety (such as Lys-36p in pepsinogen) to neutralize the charge repulsion between the catalytic Asps at neutral pH (44). Instead, the two Asp residues are kept apart from each other and are engaged in a network of hydrogen bonds both within and between the Psi loops (Fig. 4b). The function of the prosegment, together with the rearranged mature N terminus, is to maintain the molecule in the open conformation, leaving the active site accessible but greatly distorted.

FIG. 3. A structural view of the conversion of proplasmepsin II (Left) (40) to plasmepsin II (Right) (41). The N and C domains are colored yellow, the central motif is green, the prosegment (with its helices labeled) is magenta, and the N-terminal 30 aa of the mature sequence are cyan. The catalytic aspartic acid residues (34 and 214) are colored red. The tip of the flap in proplasmepsin II, which is disordered in the crystal structure, is shown as a dotted line. The peptide bonds cleaved in autoactivation (112p–113p) and in the maturase-assisted activation (124p–1) are marked by asterisks in proplasmepsin II.

The structure of proplasmepsin II suggests that disruption of three salt bridges (Glu-87p with Arg-92p, Asp-91p with His-164, and Glu-108p with Lys-107p) at low pH plays a key role in autoactivation. Dissociation of these interactions at low pH should destabilize the prosegment structure and weaken the association between the prosegment and the C domain. The most dramatic effect of acidification, however, should occur at Asp-4. This residue keeps the promature junction locked in the compact Tyr-Asp loop and tethers this loop to the C domain (Fig. 4a). Protonation of the Asp-4 side chain should disrupt the interactions of its carboxylate oxygens (with Tyr-122p, Lys-238, and Phe-241), opening up the Tyr-Asp loop and introducing a slack of five residues into the prosegment harness. With this region of the prosegment loosened, the molecule may adopt the domain-closed form with a functional active site. It should be noted that the bond cleaved in autoactivation of proplasmepsin II, Phe-112p to Leu-113p, is located at the C terminus of the prosegment helix 2, which must be one of the early locations to lose its secondary structure upon acidification. Once the active site is formed, the scissile bond, now in an extended conformation, then can be presented for cleavage either in cis or in trans.

FIG. 4. (a) The promature junction in proplasmepsin II. The hydrogen bonding network of Asp-4, within the Tyr-Asp loop (Tyr-122p-Asp-4) and to the C-domain residues Lys-238 and Phe-241, is shown. (b) The immature active site in proplasmepsin II. The Psi loops (31–41 and 211–220) interact with each other through direct and water-mediated hydrogen bonds. In addition, both Psi loops form hydrogen bonds with the N-terminal residues 11–17. (c) The active site of plasmepsin II, showing the symmetrical arrangement of hydrogen bonds around the catalytic aspartates known as the fireman’s grip. The sphere between Asp-34 and Asp-214 is the oxygen atom from pepstatin that was present in the crystal structure (41). The N-terminal residues 15–18 and their interactions with the Psi loops also are shown.


FIG. 5. A structural model for the N-terminal, autocatalytic excision of the enteroviral 3C proteases. The model is based on the crystal structure of the poliovirus 3C protease (HPV 3C) (63). The secondary structure of HPV 3C is shown in a ribbon representation. The N-terminal β-barrel domain is blue and the C-terminal β-barrel domain is mauve. (I) Model of the precursor of the 3C protease with the 3B|3C cleavage site sequence bound in the active site in the conformation of a cognate substrate of the 3C protease. The model was derived from the P5 to P2′ residues of OMTKY3 bound in the active site of Streptomyces griseus protease B (SGPB) (65) after the optimal superposition of HPV 3C and SGPB (63). Included in this figure are the residues starting at the P5 (Thr) position of the 3B protein. The P5 to P1 residues and residues 1–5 of HPV 3C (P1′ to P5′) are colored gray; residues 6–13 are dark gray. The side chains of three active site residues, the nucleophile Cys-147 (yellow), the general acid-base catalyst His-40 (blue), and the S1 specificity determinant His-161 (light blue), are included. Residues 1–11 of HPV 3C reach into the active site of the protease and are in a mostly extended conformation. After the intramolecular cleavage the new N terminus Gly-1 dissociates from the active site while the P5 to P1 residues are still bound (II). Subsequently residues 6–13 of HPV 3C fold into a stable α-helix (colored black), which prevents the new N terminus from binding again to the active site and renders the conformational change irreversible. Arg-13 of the conserved sequence motif K/RR/KNL/I, which forms the last turn of the N-terminal helix in HPV 3C, anchors the N terminus to the structure. (III) The crystal structure of HPV 3C is shown with the N-terminal α-helix in black. The rearrangement of the N terminus in this model is accompanied by small conformational changes of β-strands aI and bI of the N-terminal domain and the loop (yellow) that connects β-strands aII and bII of the C-terminal domain.

Autoactivation of proplasmepsin II takes place readily between pH 3.8 and 4.7 (45). The lower pH range covers the pKas of Asp and Glu carboxylates in proteins (46), even taking into account some pKa depression that may be expected because of these residues’ participation in salt bridges and hydrogen bonds. For instance, the involvement of Asp-4 in a number of hydrogen bonds and a salt bridge (Fig. 4a) is likely to lower its side-chain pKa relative to that of a solvent-exposed carboxylate.
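The link between pH and carboxylate protonation invoked here follows the Henderson–Hasselbalch relation, which gives the protonated fraction of an acid as 1 / (1 + 10^(pH − pKa)). The sketch below tabulates that fraction across the pH values mentioned in the text. The pKa values in `ASSUMED_PKA` are illustrative placeholders for solvent-exposed side chains and are not taken from the paper; a residue such as Asp-4, with a depressed pKa, would require a lower value.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of an acid group in the protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed, illustrative side-chain pKa values (not from the paper).
ASSUMED_PKA = {"Asp": 3.9, "Glu": 4.3}

# pH values mentioned in the text: neutrality, the autoactivation range
# of proplasmepsin II (3.8 to 4.7), and the stomach lumen (about 2.0).
PH_VALUES = [7.0, 4.7, 3.8, 2.0]


def protonation_table(pka_by_residue, ph_values):
    """Return {residue: [(pH, fraction protonated), ...]}."""
    return {
        residue: [(ph, fraction_protonated(ph, pka)) for ph in ph_values]
        for residue, pka in pka_by_residue.items()
    }


if __name__ == "__main__":
    for residue, rows in protonation_table(ASSUMED_PKA, PH_VALUES).items():
        for ph, frac in rows:
            print(f"{residue} (assumed pKa {ASSUMED_PKA[residue]}): "
                  f"pH {ph:.1f} -> {100 * frac:.1f}% protonated")
```

Under these assumed pKa values the carboxylates are almost fully ionized at neutral pH and substantially protonated within the autoactivation range, which is the qualitative behavior the argument relies on.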

The requirement for low pH for activation by a maturase is less conclusive based on the proplasmepsin II structure. The promature junction is located at the surface of the molecule and therefore should be accessible to the external maturase. Acidification may be necessary to induce the Tyr-Asp loop opening for the Gly-124p to Ser-1 scissile bond to assume an extended conformation suitable for proteolytic cleavage. Alternatively, low pH may be required if the maturase itself has an acidic pH optimum. Further studies of the maturase will be needed to resolve this issue.

Autocatalytic Excision of Picornaviral 3C Proteases

Picornaviruses constitute a large family of positive-sense, single-stranded RNA viruses (47). An early and important step in the picornaviral life cycle is the translation of the single-stranded viral RNA genome into a single large polyprotein (48, 49). The viral polyprotein is processed into the individual viral gene products by the viral 3C protease, itself a part of the polyprotein (50). The picornaviral 3C protease is the prototype of the new class of chymotrypsin-like cysteine proteases (49, 51, 52). It can cleave itself out of the viral polyprotein in cis and in trans, when it is expressed as part of the polyprotein or separately (53, 54).

The autocatalytic excision is not correlated with concomitant development of the proteolytic activity. The precursors of the 3C gene product already have proteolytic activity (55–58). In the viruses of the genus enterovirus the precursor 3CD (the 3D gene product constitutes the RNA-dependent RNA polymerase) shows a proteolytic activity that is distinct from that of 3C. In poliovirus, and presumably in most other enteroviruses, the proteolytic activity of the precursor 3CD is required for the efficient processing of the capsid precursor proteins (58). In hepatitis A virus the 3ABC gene product appears to be an important, proteolytically active intermediate of the polyprotein processing (57).

Palmenberg and Rueckert (59) examined the kinetics of the polyprotein processing in the picornavirus, encephalomyocarditis virus. Their data suggest that the autocatalytic excision of the 3C gene product can be a truly intramolecular event. Further evidence for an intramolecular excision of the 3C protease from poliovirus was provided by Hanecak et al. (60). Taken together, these data suggest that the autocatalytic excision of 3C from the polyprotein at both the N and C termini can be either intramolecular or intermolecular.

Once atomic resolution structures of 3C proteases became available (61–63), it was possible to develop structural models for the autocatalytic excision of the picornaviral 3C proteases. The crystal structures confirmed that the picornaviral 3C proteases are structurally related to the chymotrypsin family of serine proteases. Based on some of the unique structural details, the authors of the crystal structure papers (61–63) proposed similar models for an autocatalytic, intramolecular cleavage at the N terminus of 3C. They also agreed that it is much less obvious how an intramolecular cleavage could occur at the C terminus of 3C.

The N-terminal residues of the picornaviral 3C proteases form a stable α-helix that precedes the first strand of the N-terminal β-barrel domain. This helix packs against the surface of the C-terminal domain of 3C. The last turn of this α-helix is formed by the residues of a highly conserved sequence motif K/RR/KNI/L (48).

Another unusual feature of the picornaviral 3C proteases is an antiparallel β-ribbon that extends from the C-terminal β-barrel (49). It forms an extension of the second and third β-strands of the C-terminal domain and corresponds topologically to the methionine loop of the chymotrypsin-like serine proteases. This feature is also present in the bacterial serine proteases such as α-lytic protease and Streptomyces griseus protease B (64, 65). The recent crystal structure of α-lytic protease complexed with its prosegment (2) revealed that this feature plays an important role in the folding of the protease and in the autocatalytic, intramolecular processing of the precursor of α-lytic protease.

The structural model of an intramolecular cleavage at the N terminus of the picornaviral 3C proteases (Fig. 5) predicts that the N-terminal α-helix folds to its final conformation only after the 3C protease has cleaved its own N terminus (61–63). Before the intramolecular cleavage at the 3B|3C site, the corresponding residues [Gly-1 to Lys-12 in human poliovirus (HPV) 3C] must be in an extended conformation (Fig. 5I) to reach into the active site through the cleft between β-strand bI from the N-terminal domain and the loop connecting β-strands aII and bII from the C-terminal domain. The loop that connects β-strands aII and bII had to be moved in the model of the precursor molecule (Fig. 5I), with respect to its position in the native HPV 3C protease structure (Fig. 5III), to accommodate this. Several residues from β-strand aI also are slightly moved away from their positions in the structure of the native 3C protease to widen the cleft between the N- and C-terminal domains through which the N terminus passes.

After the autocatalytic cleavage at Gly-1 of HPV 3C the new N terminus dissociates out of the active site (Fig. 5II). The folding of residues 5–13 into a stable helix, which packs tightly onto the surface of the molecule, subsequently would render this conformational change irreversible. It is necessary to remove the new N terminus from the protease active site to prevent intramolecular, competitive product inhibition of the protease.

The conserved sequence motif K/RR/KNI/L that eventually forms the last turn of the N-terminal helix anchors the residues of the N-terminal helix to the core structure of the protease. The side chains of Arg-13 and Asn-14 interact with the highly conserved sequence motif KFRDI of the RNA-binding site of the 3C protease. The residues that will become the N-terminal helix are in an extended conformation in the precursor (Fig. 5I). The up-down side-chain pattern in this extended conformation places the small side chains of Ala-7, Ala-9, and Ala-11 (P7′, P9′, and P11′) into the cleft between the two domains of the protease, and the larger side chains of residues Tyr-6, Val-8, and Met-10 point to the surface. Larger side chains than alanine in positions 7, 9, and 11 would not have fitted easily into the surface of the cleft. We suggest therefore that the three alanine residues are important for the conformation of the N terminus in the precursor as well as for the formation of the N-terminal helix.
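The up-down alternation described above can be written out explicitly. The following sketch lists residues 6–11 of HPV 3C with their primed-side (P′) labels and the side-chain direction stated in the text. The parity rule in `orientation` is a simplification assumed to hold only for this short extended segment, and the function and variable names are hypothetical.

```python
# Residues 6-11 of HPV 3C as named in the text (position, residue type).
PRECURSOR_SEGMENT = [
    (6, "Tyr"), (7, "Ala"), (8, "Val"),
    (9, "Ala"), (10, "Met"), (11, "Ala"),
]


def orientation(position):
    """Side-chain direction in the extended precursor conformation.

    Assumption for this segment only: odd positions face the interdomain
    cleft, even positions face the surface, as stated in the text.
    """
    return "cleft" if position % 2 == 1 else "surface"


def describe(segment):
    """Return (residue, P' label, orientation) for each position.

    Residue n of mature 3C lies n positions after the 3B|3C scissile
    bond, so it carries the label Pn'.
    """
    return [
        (f"{name}-{pos}", f"P{pos}'", orientation(pos))
        for pos, name in segment
    ]


if __name__ == "__main__":
    for residue, label, side in describe(PRECURSOR_SEGMENT):
        print(f"{residue:7s} {label:5s} side chain toward the {side}")
```

Written this way, the pattern makes the argument easy to see: every cleft-facing position in the segment is an alanine, and every surface-facing position carries a larger side chain.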

It is much more difficult to envision an intramolecular cleavage of the picornaviral 3C protease at its own C terminus. The crystal structures of the core proteins from Sindbis and Semliki Forest viruses (66) show how an additional β-strand can reach from the C terminus to the active site of a chymotrypsin-like protease; however, the unique antiparallel β-ribbon of the picornaviral 3C proteases that extends from β-strands bII and cII and interacts with the N-terminal domain would prevent this (Fig. 5III).

We thank Perry d’Obrennan for help in making Fig. 2. Mae Wylie has been very helpful in getting the manuscript into its final polished form. A.R.K. was supported by a Medical Research Council of Canada Studentship; N.K.-B. was the holder of an Alberta Heritage Foundation for Medical Research Studentship. This work has been supported by the Medical Research Council of Canada and by Grant UO1AI38249 from the National Institute of Allergy and Infectious Diseases of the National Institutes of Health.

1. Neurath, H. (1957) in Advances in Protein Chemistry XII, eds. Anfinsen, C.B., Jr., Anson, M.L., Bailey, K. & Edsall, J.T. (Academic, New York), pp. 319–386.
2. Sauter, N.K., Mau, T., Rader, S.D. & Agard, D.A. (1998) Nat. Struct. Biol. 5, 945–950.
3. Khan, A.R. & James, M.N.G. (1998) Protein Sci. 7, 815–836.
4. Davie, E.W., Fujikawa, K. & Kisiel, W. (1991) Biochemistry 30, 10363–10370.
5. Herriott, R.M. (1939) J. Gen. Physiol. 22, 65–78.
6. Vernet, T., Khouri, H.E., Laflamme, P., Tessier, D.C., Gour-Salin, B., Storer, A.C. & Thomas, D.Y. (1991) J. Biol. Chem. 266, 21451–21457.
7. Aviles, F.X., Vendrell, J., Guasch, A., Coll, M. & Huber, R. (1993) Eur. J. Biochem. 211, 381–389.
8. Nagase, H., Enghild, J.J., Suzuki, K. & Salvesen, G. (1990) Biochemistry 29, 5783–5789.
9. Huber, R. & Bode, W. (1978) Acc. Chem. Res. 11, 114–122.
10. Schechter, I. & Berger, A. (1967) Biochem. Biophys. Res. Commun. 27, 157–162.
11. Bode, W., Schwager, P. & Huber, R. (1978) J. Mol. Biol. 118, 99–112.
12. Becker, J.W., Marcy, A.I., Rokosz, L.L., Axel, M.G., Burbaum, J.J., Fitzgerald, P.M.D., Cameron, P.M., Esser, C.K., Hagmann, W.K., Hermies, J.D. & Springer, J.P. (1995) Protein Sci. 4, 1966–1976.


13. Turk, D., Podobnik, M., Kuhelj, R., Dolinar, M. & Turk, V. (1996) FEBS Lett. 384, 211–214.
14. Coulombe, R., Grochulski, P., Sivaraman, J., Menard, R., Mort, J.S. & Cygler, M. (1996) EMBO J. 15, 5492–5503.
15. James, M.N.G. & Sielecki, A.R. (1986) Nature (London) 319, 33–38.
16. Moore, S.A., Sielecki, A.R., Chernaia, M.M., Tarasova, N.I. & James, M.N.G. (1995) J. Mol. Biol. 247, 466–485.
17. Khan, A.R., Cherney, M.M., Tarasova, N.I. & James, M.N.G. (1997) Nat. Struct. Biol. 4, 1010–1015.
18. Fujinaga, M., Chernaia, M.M., Tarasova, N.I., Mosimann, S.C. & James, M.N.G. (1995) Protein Sci. 4, 960–972.
19. Esnouf, R.M. (1997) J. Mol. Graphics 15, 133–138.
20. Merritt, E.A. & Murphy, M.E.P. (1994) Acta Crystallogr. D 50, 869–873.
21. Taggart, R.T., Cass, L.G., Mohandas, T.K., Derby, P., Barr, P.J., Pals, G. & Bell, G.I. (1989) J. Biol. Chem. 264, 375–379.
22. Bateman, K.S., Cherney, M.M., Tarasova, N.I. & James, M.N.G. (1998) in The Aspartic Proteases: Retroviral and Cellular Enzymes, ed. James, M.N.G. (Plenum, New York), pp. 259–263.
23. Herriott, R.M. (1938) J. Gen. Physiol. 21, 501–540.
24. McPhie, P. (1972) J. Biol. Chem. 247, 4277–4281.
25. Auer, H.E. & Glick, D.M. (1984) Biochemistry 23, 2735–2739.
26. Foltmann, B. & Jensen, A.L. (1982) Eur. J. Biochem. 128, 63–70.
27. Davies, D.R. (1990) Annu. Rev. Biophys. Chem. 19, 189–215.
28. Al-Janabi, J., Hartsuck, J. & Tang, J. (1971) J. Biol. Chem. 247, 4628–4632.
29. Foltmann, B. (1981) Essays Biochem. 17, 52–84.
30. Perlmann, G.E. (1963) J. Mol. Biol. 6, 452–464.
31. Glick, D.M., Shalitin, Y. & Hitt, C.R. (1989) Biochemistry 28, 2626–2630.
32. Marciniszyn, J., Huang, J.S., Hartsuck, J.A. & Tang, J. (1976) J. Biol. Chem. 251, 7095–7102.
33. Kageyama, T., Ichinose, M., Miki, K., Athauda, S.B., Tanji, M. & Takahashi, K. (1989) J. Biochem. (Tokyo) 105, 15–22.
34. Dunn, B. (1997) Nat. Struct. Biol. 4, 969–972.
35. Dame, J.B., Reddy, R.G., Yowell, C.A., Dunn, B.M., Kay, J. & Berry, C. (1994) Mol. Biochem. Parasitol. 64, 177–190.
36. Berry, C., Dame, J.B., Dunn, B.M. & Kay, J. (1995) in Aspartic Proteases: Structure, Function, Biology, and Biomedical Implications, ed. Takahashi, K. (Plenum, New York), pp. 511–518.
37. Francis, S.E., Gluzman, I.Y., Oksman, A., Knickerbocker, A., Mueller, R., Bryant, M.L., Sherman, D.R., Russell, D.G. & Goldberg, D.E. (1994) EMBO J. 13, 306–317.
38. Francis, S.E., Banerjee, R. & Goldberg, D.E. (1997) J. Biol. Chem. 272, 14961–14968.
39. Hill, J., Tyas, L., Phylip, L., Kay, J., Dunn, B.M. & Berry, C. (1994) FEBS Lett. 352, 155–158.
40. Khazanovich Bernstein, N., Cherney, M.M., Loetscher, H., Ridley, R.G. & James, M.N.G. (1999) Nat. Struct. Biol. 6, 32–37.
41. Silva, A.M., Lee, A.Y., Gulnik, S.V., Maier, P., Collins, J., Bhat, T.N., Collins, P.J., Cachau, R.E., Luker, K.E., Gluzman, I.Y., et al. (1996) Proc. Natl. Acad. Sci. USA 93, 10034–10039.
42. Sali, A., Veerapandian, B., Cooper, J.B., Moss, D.D., Hofmann, T. & Blundell, T.L. (1992) Proteins 12, 158–170.
43. Fusek, M. & Vetvicka, V. (1995) Aspartic Proteases: Physiology and Pathology (CRC, New York), pp. 22–24.
44. Sielecki, A.R., Fujinaga, M., Read, R.J. & James, M.N.G. (1991) J. Mol. Biol. 219, 671–692.
45. Moon, R.P., Bur, D., Loetscher, H., D’Arcy, A., Tyas, L., Oefner, C., Grueninger-Leitch, F., Mona, D., Rupp, K., Dorn, A., et al. (1997) Eur. J. Biochem. 244, 552–560.
46. Tanford, C. (1962) Adv. Protein Chem. 17, 69–165.
47. Rueckert, R.R. (1996) in Fields Virology, eds. Fields, B.N., Knipe, D.M., Howley, P.M., Channock, R.M., Melnick, J.L., Monath, T.P., Roizmann, B. & Straus, S.E. (Lippincott-Raven, Philadelphia), pp. 609–654.
48. Bergmann, E.M. & James, M.N.G. (1999) in Proteases of Infectious Agents, ed. Dunn, B. (Academic, San Diego), pp. 139–163.
49. Bergmann, E.M. & James, M.N.G. (1999) in Handbook of Experimental Pharmacology, eds. von der Helm, K. & Korant, B. (Springer, Heidelberg), in press.
50. Palmenberg, A.C. (1990) Annu. Rev. Microbiol. 44, 602–623.
51. Gorbalenya, A.E. & Snijder, E.J. (1996) Perspect. Drug Discovery Des. 6, 64–86.
52. Ryan, M.D. & Flint, M. (1997) J. Gen. Virol. 78, 699–723.
53. Harmon, S.A., Updike, W., Jia, X.-Y., Summers, D.F. & Ehrenfeld, E. (1992) J. Virol. 66, 5242–5247.
54. Richards, O.C., Ivanoff, L.A., Bienkowska-Szewczyk, K., Butt, B., Petteway, S.R., Jr., Rothstein, M.A. & Ehrenfeld, E. (1987) Virology 161, 348–

356.55. Davis, G.J., Wang, Q.M., Cox, G.A., Johnson, R.B., Wakulchik, M., Datson, C.A. & Villarreal, E.C. (1997) Arch. Biochem. Biophys. 346, 125–130.56. Jürgensen, D., Kusov, Y.Y., Facke, M., Kräusslich, H.G. & Gauss-Müller, V. (1993) J. Gen. Virol. 74, 677–683.57. Probst, C., Jecht, M. & Gauss-Müller, V. (1998) J. Virol. 72, 8013–8020.58. Ypma-Wong, M.F., Dewalt, P.G., Johnson, V.H., Lamb, J.G. & Semler, B.L. (1988) Virology 166, 265–270.59. Palmenberg, A.C. & Rueckert, R.R. (1982) J. Virol. 41, 244–249.60. Hanecak, R., Semler, B.L., Ariga, H., Anderson, C.W. & Wimmer, E. (1984) Cell 37, 1063–1073.61. Bergmann, E.M., Mosimann, S.C., Chernaia, M.M., Malcolm, B.A. & James, M.N.G. (1997) J. Virol. 71, 2436–2448.62. Matthews, D.A., Smith, W.W., Ferre, R.A., Condon, B., Budahazi, G., Sisson, W., Villafranca, J.E., Janson, C.A., McElroy, H.E., Gribskov, C.L. &

Worland, S. (1994) Cell 77, 761–771.63. Mosimann, S.C., Chernaia, M.M., Sia, S., Plotch, S. & James, M.N.G. (1997) J. Mol Biol 273, 1032–1047.64. Fujinaga, M., Delbaere, L.T.J., Brayer, G.D. & James, M.N.G. (1985) J. Mol Biol 184, 479–502.65. Huang, K, Lu, W., Anderson, S., Laskowski, M., Jr. & James, M.N.G. (1995) Protein Sci. 4, 1985–1997.66. Tong, L., Wengler, G. & Rossmann, M.G. (1993) J. Mol Biol 230, 228–247.


The catalytic sites of 20S proteasomes and their role in subunit maturation: A mutational and crystallographic study

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

MICHAEL GROLL*, WOLFGANG HEINEMEYER†, SIBYLLE JÄGER†, TOBIAS ULLRICH*, MATTHIAS BOCHTLER*, DIETER H. WOLF†, AND ROBERT HUBER*‡

*Max-Planck-Institut für Biochemie, D-82152 Martinsried, Germany; and †Institut für Biochemie, Universität Stuttgart, D-70569 Stuttgart, Germany

ABSTRACT We present a biochemical and crystallographic characterization of active site mutants of the yeast 20S proteasome with the aim of characterizing substrate cleavage specificity, subunit intermediate processing, and maturation. β1(Pre3), β2(Pup1), and β5(Pre2) are responsible for the postacidic, tryptic, and chymotryptic activity, respectively. The maturation of active subunits is independent of the presence of other active subunits and occurs by intrasubunit autolysis. The propeptides of β6(Pre7) and β7(Pre4) are intermediately processed to their final forms by β2(Pup1) in the wild-type enzyme and by β5(Pre2) and β1(Pre3) in the β2(Pup1) inactive mutants. A role of the propeptide of β1(Pre3) is to prevent acetylation and thereby inactivation. A gallery of proteasome mutants that contain active site residues in the context of the inactive subunits β3(Pup3), β6(Pre7), and β7(Pre4) shows that the presence of Gly-1, Thr1, Asp17, Lys33, Ser129, Asp166, and Ser169 is not sufficient to generate activity.

Proteasomes are essential, ubiquitous intracellular proteases that degrade a broad variety of cytoplasmic, nuclear, and membrane proteins that have been marked for degradation by the attachment of polyubiquitin chains (1–3). Eukaryotic proteasomes are large protein complexes with a molecular mass around 2,000 kDa, with a modular architecture (4, 5). The catalytic core of the molecule is the 20S proteasome, a cylindrical particle that consists of four heptameric rings made from seven different subunits each, which are present in two copies and in unique locations so that the particle has overall 2-fold symmetry (1, 4–7). The yeast 20S proteasome subunits fall into two different classes phylogenetically related to the two subunits α and β of the archaebacterial proteasome (8) and have been named accordingly (7). The α-subunits are not catalytically active and form antechambers to the central cavity of the 20S complex that is built from the β-subunits. In Thermoplasma acidophilum proteasomes, all β-subunits are transcribed and translated from one gene only and are expressed as precursors. In the process of particle maturation, all copies of the β-subunit become active, so that two rings of seven catalytic sites each are formed on the inner walls of the central chamber. The N-terminal threonine residue is exposed by this processing activity as the nucleophile in peptide bond hydrolysis (9, 10). It will subsequently be referred to as Thr1, thus assigning negative integers to residues of the propeptide. Based on the crystal structure of the T. acidophilum 20S proteasome, the distance between active site threonines was suggested as the molecular ruler that determines the length distribution of proteasome-generated peptides (9).
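The numbering convention introduced above (Thr1 as the first residue of the mature subunit, negative integers for the propeptide, and no residue 0) can be made concrete with a short sketch. The function name and the 19-residue propeptide length are illustrative assumptions; the length is chosen only to match the Met-19 N terminus reported for unprocessed β1.

```python
def residue_label(precursor_index: int, propeptide_length: int) -> int:
    """Map a 1-based position in the precursor to Thr1-based numbering.

    Propeptide residues receive negative labels ending at -1 (the Gly-1
    position); the first mature residue is labelled 1. There is no residue 0.
    """
    if precursor_index < 1:
        raise ValueError("precursor_index is 1-based")
    if precursor_index <= propeptide_length:
        return precursor_index - propeptide_length - 1
    return precursor_index - propeptide_length


# Hypothetical example: a 19-residue propeptide (assumption for illustration).
PROPEPTIDE_LENGTH = 19
for index in (1, 19, 20, 52):
    print(index, "->", residue_label(index, PROPEPTIDE_LENGTH))
```

With these assumptions, the first precursor residue is labelled -19, the last propeptide residue -1, and the following residue 1.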

A more complex picture for the mechanism of oligopeptide product generation was suggested by the crystal structure of the yeast 20S proteasome (7). It contains seven different α- and β-type subunits arranged in unique locations (Fig. 1). Four β-type subunits are inactive because they contain either unprocessed [β3(Pup3) and β4(Pre1)] or intermediately processed propeptides [β6(Pre7) and β7(Pre4)]. The remaining three subunits β1(Pre3), β2(Pup1), and β5(Pre2) have N-terminal threonine residues, are active, and have specificities determined largely by the nature of their S1 pockets (7). Specific mutants of the active β-type subunits have been isolated (11). They allowed the identification of different substrate specificities (11, 12) of the proteasome and led to a hypothesis for an intermolecular processing mechanism of inactive β-subunits.

Functional and structural analysis of the mutant proteasomes allows us to investigate substrate specificities, catalytic and autolytic mechanisms, and intermediate processing of propeptides. The mutants also provide hints to the role of propeptides in proteasome maturation and enzymatic activity and help to clarify the mechanism by which peptide product length is controlled. They provide critical tests of possible allosteric interactions in the proteasome. A number of mutants of inactive subunits were generated to define the roles of individual residues for inactivity, with the ultimate goal of activating those subunits.

MATERIALS AND METHODS

Protein Preparation and Analysis. Yeast strains that express mutant proteasomes were generated as described (11). Cells were grown on a 5-liter scale, and the modified enzymes were purified as reported for the wild type (7). 20S proteasomes were separated into subunits by reversed-phase HPLC. One-hundred-microgram samples were loaded on a RP60 Supersphere column (Merck). The column was washed with a gradient from 0 to 30% acetonitrile in 0.1% trifluoroacetic acid. Single subunits were eluted in a gradient from 30 to 60% acetonitrile in 0.1% trifluoroacetic acid at a flow rate of 0.3 ml/min and at a back pressure of 140 bar (1 bar=100 kPa). Peaks were identified and propeptides characterized by N-terminal sequence analysis and mass spectrometry.

Crystals of 20S proteasome mutants from Saccharomyces cerevisiae were grown in hanging drops at 24°C as described (7). The crystals were frozen in a stream of cold nitrogen gas (90 K). Data were collected by using synchrotron radiation with λ=1.1 Å on the BW6 beamline at the Deutsches Elektronen-Synchrotron Centre (Hamburg, Germany) (Table 3). The anisotropy of diffraction was corrected by an overall temperature factor by comparing observed and calculated structure amplitudes by using X-PLOR (13). Electron density was averaged 10 times over the 2-fold noncrystallographic symmetry axis by using MAIN (14). Model building was carried out with FRODO (15).

‡To whom reprint requests should be addressed. E-mail: [email protected]. PNAS is available online at www.pnas.org.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.rcsb.org (PDB ID code 1RYP).

FIG. 1. (a) Topology of the yeast 20S proteasome. The active site threonine 1 residues are located at the inner wall of the cylindrical particle. (b) Scheme of the β-rings with given distances between the active site threonines.

RESULTS AND DISCUSSION

Subunit Processing. The topology of the yeast 20S proteasome is shown in Fig. 1a with the relevant distances between the active sites given in Fig. 1b. We have purified mutant yeast 20S proteasomes with a reduced number of active subunits carrying exchanges of Thr1 for Ala in β1 and β2. In β5, Lys33 was exchanged for Ala or Arg because β5T1A is not viable. Double mutants of β1 and β2 can be made. Some of these mutants show reduced growth (11), but 20S proteasomes can be isolated. We have characterized the β-subunits chemically by Edman degradation and, in some cases, by mass spectrometry after separation of the individual subunits by HPLC (Tables 1 and 2).

The active subunits β1, β2, and β5 are processed autocatalytically and independently of each other. Inactivating β1 does not affect processing of β2 and vice versa. Similarly, the mutations β5K33A and β5K33R lead to inactivity of β5 but have no effect on maturation of β1 and β2. This is consistent with earlier findings (16, 17), including pulse-chase experiments, which demonstrate that subunit maturation occurs late in proteasome assembly (11, 16, 18–20) after the formation of 15S–16S proteasome precursor particles. These particles are believed to be half proteasomes. As the active sites in 20S proteasomes are nearly 30 Å apart from each other, it appeared impossible for the Gly-1–Thr1 bond to be cleaved by a neighboring subunit.

The data on β5 maturation are less straightforward to interpret. β5K33R has very low enzymatic activity but is autoprocessed. β5K33A is also inactive but partially processed. We find clear electron density for the propeptide to residue Cys-8 in this mutant, but we can also isolate the autoprocessed species by HPLC and identify it by mass spectrometry (Table 2). An explanation might be an exceptional lability of the Gly-1–Thr1 bond under the strongly acidic conditions of sample preparation for mass spectrometry.

The β1 and β2 T1A exchange in both the single and the double mutants leads to a failure in autoprocessing and to the presence of intact or intermediately processed propeptides of these subunits. In the β1T1A β2T1A double mutant, β1 has its full-length propeptide attached, and β2 is intermediately processed after Leu-15. β7 is cleaved after Ile-19. In β6, cleavage after Ala-17 and Thr-14 is found. Cleavage occurs after nonpolar residues, consistent with cleavage by β5. Cleavage sites are at a sufficient distance from residue 1 to reach the remaining active centers of β5 in the same ring for β6 and β7 and in the opposite ring for β2 (Fig. 1b).

In the single β1T1A mutant, processing of β6 and β7 is as in the wild type, but β1 is cleaved after Arg-10, evidently by β2, whereas in β2T1A the β6 and β7 propeptides are longer than in the wild type. Here, β2 itself could not be characterized.

The inactive subunits β6 and β7 are intermediately processed by one of the active subunits. β6 is adjacent to β5 on the same ring and to β2 on the opposite ring but further away from β1 on both rings of the 20S proteasome (Fig. 1). The nine-amino acid propeptide in the mature wild-type protein is too short to span the distance to either of the β1 subunits. Experimentally, we find that inactivating β1 in the β1T1A mutant, β5 in the β5K33A mutant, and β1 and β5 in the β1T1A β5K33R mutant has no effect on the propeptide processing of β6. In contrast, a significantly longer propeptide remains attached to β6 in the β2T1A mutant. We conclude that β6 is processed by β2. Cleavage occurs after His-10 (Table 1), consistent with the trypsin-like activity of β2. Because β2 in the same ring is too far away to be reached by a nonapeptide Gln-9 to Gly-1, β2 of the opposite ring must be the subunit that processes β6.

In the case of β7, the situation is similar, but the subunits β5 and β1 swap roles. β7 is close to β2 on the opposite ring and to the subunits β1 on both rings. β5 is too far away to be involved in the final maturation step. Experimentally, the wild-type propeptide of β7 is found in the β1T1A mutant, in the β5K33A mutant, and in the β1T1A β5K33R double mutant. In the β2T1A mutant, the cleavage that occurs in the wild type is suppressed, identifying β2 as the responsible subunit in the wild type. The cut occurs after Asn-9, a residue for which β2 has some specificity (12). These data substantiate previous biochemical findings on β7 maturation in the β2T1A single and β1T1A β2T1A double mutant, which led to the hypothesis that inactive β-subunits are processed by the closest active neighbor subunit (11).
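The assignment of β2 as the processing subunit for β6 and β7 rests on an elimination argument: any subunit whose inactivation leaves processing unchanged is ruled out, and the responsible subunit must be among those inactivated whenever processing changes. The sketch below encodes only the observations stated in the text; the function and variable names are illustrative, not part of the original analysis.

```python
def responsible_subunits(observations, active_subunits=("beta1", "beta2", "beta5")):
    """Return the active subunits compatible with all mutant observations.

    Each observation is (set of inactivated subunits, True if the propeptide
    is processed as in the wild type).
    """
    candidates = set(active_subunits)
    for inactivated, wild_type_processing in observations:
        if wild_type_processing:
            # Processing survives without these subunits, so they are ruled out.
            candidates -= set(inactivated)
        else:
            # Processing is altered, so the culprit is among the inactivated ones.
            candidates &= set(inactivated)
    return candidates


# Observations reported for the beta6 (and likewise beta7) propeptide.
observations = [
    ({"beta1"}, True),            # beta1 T1A
    ({"beta5"}, True),            # beta5 K33A
    ({"beta1", "beta5"}, True),   # beta1 T1A / beta5 K33R
    ({"beta2"}, False),           # beta2 T1A
]
print(responsible_subunits(observations))
```

Only β2 remains compatible with all four mutants, which mirrors the conclusion drawn above. The sketch says nothing about which ring the responsible subunit sits on; that part of the argument depends on the distances in Fig. 1b.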


The intermediately processed propeptides of β6 and β7 had been found in well defined locations in the molecular structure of the wild-type protein such that their N termini lie at the inner annulus of the β-subunit rings far removed from the sites of proteolytic cleavage defined here (7). The same holds for the intermediately processed β1 propeptide in β1T1A and in β1T1A β5K33R. It has defined electron density to Leu-9, which also lies at the inner annulus, not far (16 Å) from β6Gln-9 and β7Thr-8. These observations indicate a major rearrangement of the propeptides after intermediate processing and fixation at the final sites seen in the crystal structure. In β1T1A β2T1A, the full-length propeptide of β1 and the intermediately processed propeptides of β2, β6, and β7 have well defined electron density up to residues Met-19 (β1), Ala-14 (β2), Gln-9 (β6), and Thr-8 (β7), respectively.

Table 1. Yeast 20S proteasome mutants prepared and analyzed by N-terminal sequencing

| Mutant | β1(Pre3) | β2(Pup1) | β3(Pup3) | β4(Pre1) | β5(Pre2) | β6(Pre7) | β7(Pre4) |
|---|---|---|---|---|---|---|---|
| Wild type | Gly-1/Thr1 | Gly-1/Thr1 | Met-9 | Met-1 | Gly-1/Thr1 | His-10/Gln-9, β2(Pup1)� | Asn-9/Thr-8, β2(Pup1)� |
| β1(Pre3) without propeptide | Acetyl (mass*) | Gly-1/Thr1 | Met-9 | Met-1 | Gly-1/Thr1 | His-10/Gln-9, β2(Pup1)� | Asn-9/Thr-8, β2(Pup1)� |
| β1(Pre3) T1A | Arg-10/Leu-9, β2(Pup1) | Gly-1/Thr1 | Met-9 | Met-1 | Gly-1/Thr1 | His-10/Gln-9, β2(Pup1)� | Asn-9/Thr-8, β2(Pup1)� |
| β2(Pup1) T1A | Gly-1/Thr1 | XXX | Met-9 | Met-1 | Gly-1/Thr1 | Ala-17/Ser-16, β5(Pre2)� | Val-10/Asn-9, β1(Pre3)� |
| β5(Pre2) T1A | Not viable | Not viable | Not viable | Not viable | Not viable | Not viable | Not viable |
| β5(Pre2) K33A | Gly-1/Thr1 | Gly-1/Thr1 | Met-9 | Met-1 | XXX (mass*) | His-10/Gln-9, β2(Pup1)� | Asn-9/Thr-8, β2(Pup1)� |
| β1(Pre3) T1A β2(Pup1) T1A | Met-19 | Leu-15/Ala-14, β5(Pre2); β5(Pre2)� | Met-9 | Met-1 | Gly-1/Thr1 | Ala-17/Ser-16 (mass*), β5(Pre2) | Ile-19/Ala-18, β5(Pre2); β5(Pre2)� |
| β1(Pre3) T1A β5(Pre2) K33R | Arg-10/Leu-9, β2(Pup1) | Gly-1/Thr1 | Met-9 | Met-1 | Gly-1/Thr1 | His-10/Gln-9, β2(Pup1)� | Asn-9/Thr-8, β2(Pup1)� |
| β2(Pup1) T1A β5(Pre2) K33R | Not viable | Not viable | Not viable | Not viable | Not viable | Not viable | Not viable |
| β3(Pup3) G1T | Gly-1/Thr1 | Gly-1/Thr1 | Met-9 | Met-1 | Gly-1/Thr1 | His-10/Gln-9, β2(Pup1)� | Asn-9/Thr-8, β2(Pup1)� |
| β6(Pre7) G1T/A129S/A130G/H166D/V169S | Gly-1/Thr1 | Gly-1/Thr1 | Met-9 | Met-1 | Gly-1/Thr1 | His-10/Gln-9, β2(Pup1)� (mass*) | Asn-9/Thr-8, β2(Pup1)� |
| β7(Pre4) R33K/F129S | Gly-1/Thr1 | Gly-1/Thr1 | Met-9 | Met-1 | Gly-1/Thr1 | His-10/Gln-9, β2(Pup1)� | Asn-9/Thr-8, β2(Pup1)� |

P1 and P1' cleavage sites of the processed subunits are given (as P1/P1'), and responsible active subunits are indicated. (mass*), a hint for comparison with analysis by mass spectrometry in Table 2.

Implications for Cleavage Specificity. Two β subunits, β3 and β4, have propeptides of eight and one amino acids, respectively, which are too short to reach any catalytic site in the mature particle and are, indeed, not cleaved. The propeptides of all other subunits are longer, and processing intermediates are observed. The discussed mutants are defective in some of the final maturation steps and show changes in the processing pattern. As shown above, most of the subunits responsible for these cleavages are defined and can be related to cleavage specificities.

In the β1T1A mutant, a nine-residue propeptide cleaved after Arg-10 is found, consistent with cleavage by β2 in the same ring, according to its tryptic specificity and distance. Processing is completely suppressed in the β1T1A β2T1A double mutant, and the β1 Met-19 N terminus is observed. In the β2T1A mutant, autoactivation is suppressed, but the subunit could no longer be separated by HPLC. We were able to characterize the cleavage site of the propeptide of β2 in the double mutant β1T1A β2T1A between Leu-15 and Ala-14. As the only active subunit left, β5 must be responsible for this cut, assigning to it branched chain amino acid preferring (BrAAP) specificity, consistent with previous studies (12).

In the case of β5, we have mutated Lys33 to Ala and to Arg, abolishing activity. In the β5K33A mutant, the resulting propeptide of β5 is heterogeneous and could not be analyzed by Edman degradation. A fraction is found that is autolysed and has a Thr1 N terminus. In the x-ray structure, however, there is defined density to Cys-8, indicating that the major proportion is not autolysed. In contrast, β5K33R is fully autolysed.


Table 2. Results of mass spectrometry of the subunits in the different yeast 20S proteasome mutants

| Mutant | β1(Pre3) | β2(Pup1) | β3(Pup3) –Met+Ac | β4(Pre1) +Ac | β5(Pre2) | β6(Pre7) | β7(Pre4) |
|---|---|---|---|---|---|---|---|
| Wild type | t: 21,494; e: 21,492 | t: 25,085; e: XXX | t: 22,514; e: 22,504 | t: 22,558; e: 22,559 | t: 23,300; e: 23,297 | t: 24,851; e: 24,850 | t: 25,919; e: 25,919 |
| β1(Pre3) T1A | t: 22,376; e: 22,374 | t: 25,085; e: XXX | t: 22,514; e: 22,514 | t: 22,558; e: 22,559 | t: 23,300; e: 23,296 | t: 24,851; e: 24,833 (–H2O) | t: 25,919; e: 25,920 |
| β1(Pre3) without propeptide | t: 21,536; e: 21,539 | t: 25,085; e: XXX | t: 22,514; e: XXX | t: 22,516; e: 22,559 | t: 23,300; e: 23,303 | t: 24,851; e: 24,854 | t: 25,919; e: 25,921 |
| β2(Pup1) T1A | t: 21,494; e: 21,495 | t: XXX; e: XXX | t: 22,514; e: 22,516 | t: 22,558; e: 22,559 | t: 23,300; e: 23,300 | t: 25,631; e: 25,547 (+H2O) | t: 26,033; e: 26,045 |
| β3(Pup3) G1T | t: 21,494; e: 21,497 | t: 25,085; e: XXX | t: 22,559; e: 22,548 | t: 22,558; e: 22,560 | t: 23,300; e: 23,302 | t: 24,851; e: 24,856 | t: 25,919; e: 25,921 |
| β5(Pre2) K33A | t: 21,494; e: 21,496 | t: 25,085; e: XXX | t: 22,514; e: XXX | t: 22,558; e: 22,560 | t: 23,243; e: 23,246 | t: 24,851; e: 24,838 (–H2O) | t: 25,919; e: 25,920 |
| β6(Pre7) G1T/A129S/A130G/H166D/V169S | t: 21,494; e: 21,496 | t: 25,085; e: XXX | t: 22,514; e: 22,516 | t: 22,558; e: 22,560 | t: 23,300; e: 23,301 | t: 24,851; e: 24,850; t�1: 23,870; e�1: 23,872 | t: 25,919; e: 25,921 |
| β1(Pre3) T1A/β2(Pup1) T1A | t: 23,547; e: 23,562 | t: 26,436; e: XXX | t: 22,514; e: 22,516 | t: 22,558; e: 22,560 | t: 23,300; e: 23,301 | t: 25,327; e: 25,341; t�2: 25,631; e�2: 25,629 | t: 26,832; e: 25,833 |

t, theoretical; e, experimental; �1, additionally observed peak of autolysed β6; �2, additionally observed peak of partially processed β6, which was not found by N-terminal sequencing and was not found in the β2T1A mutation.
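As a plausibility check on two theoretical masses in Table 2, the sketch below applies the expected mass shift for N-terminal acetylation of β1 and for the Lys-to-Ala exchange in autolysed β5. The average mass increments (about 42.04 Da for an acetyl group, 128.17 Da for a lysine residue, and 71.08 Da for an alanine residue) are standard reference values and are not taken from the paper; the variable names are illustrative.

```python
# Average mass increments in Da (standard reference values, assumed here).
ACETYL_SHIFT = 42.04   # net addition of C2H2O on N-acetylation
LYS_RESIDUE = 128.17   # lysine residue within a peptide chain
ALA_RESIDUE = 71.08    # alanine residue within a peptide chain


def shifted_mass(base_mass: float, *shifts: float) -> float:
    """Return a base mass plus the sum of the given mass shifts."""
    return base_mass + sum(shifts)


# Theoretical wild-type masses as listed in Table 2.
WILD_TYPE_BETA1 = 21494
WILD_TYPE_BETA5 = 23300

acetylated_beta1 = shifted_mass(WILD_TYPE_BETA1, ACETYL_SHIFT)
k33a_beta5 = shifted_mass(WILD_TYPE_BETA5, ALA_RESIDUE - LYS_RESIDUE)

print(round(acetylated_beta1))  # compare with the value listed for beta1 without propeptide
print(round(k33a_beta5))        # compare with the value listed for beta5 in the K33A mutant
```

Under these assumptions the rounded results agree with the theoretical values tabulated for N-acetyl-β1 and for autolysed β5K33A.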

β6 and β7 are processed to their final forms by β2 of the opposite ring. Therefore, we have analyzed the β2T1A mutant for changes in the cleavage pattern of the β6 and β7 propeptides. In β6, the cut occurs between Ala-17 and Ser-16, as analyzed by Edman degradation of an HPLC fraction. In the β1T1A β2T1A double mutant, a component with cleavage between Thr-14 and Pro-13 is found by mass spectrometry. This bond must be hydrolyzed by β5, assigning small neutral amino acid preferring (SNAAP) specificity to β5.

In the β2T1A mutant, β7 has one extra amino acid at the N terminus compared with the wild type. Inactivating β1 in addition to β2 shifts the cleavage further upstream to the Ile-19–Ala-18 bond. We conclude that β1 and β5 cleave after Val-10 and Ile-19, respectively, demonstrating BrAAP activity for both subunits, consistent with the apolar character of the P1 pocket of β5. In the case of β1, we assume that the positive charge of Arg45 at the base of its P1 pocket is compensated by a bound bicarbonate anion to allow binding of neutral ligands, as had been observed before in the Leu-Leu-Norleucinal complex of the wild-type protein (7, 21).
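The cleavage sites discussed in this section can be summarized by grouping the P1 residues into the specificity classes named in the text. The sketch below is a rough illustration only: the residue sets assigned to each class are simplifying assumptions (for example, placing His with the tryptic group and Ser and Gly with SNAAP), and the names are hypothetical.

```python
# Illustrative residue sets per specificity class (assumptions, not definitions
# from the paper).
SPECIFICITY_CLASSES = {
    "postacidic (PGPH)": {"Asp", "Glu"},
    "tryptic": {"Arg", "Lys", "His"},
    "BrAAP": {"Leu", "Ile", "Val"},
    "SNAAP": {"Ala", "Thr", "Ser", "Gly"},
}


def classify_p1(residue: str) -> str:
    """Return the specificity class whose residue set contains the P1 residue."""
    for name, residues in SPECIFICITY_CLASSES.items():
        if residue in residues:
            return name
    return "unclassified"


# (processed subunit, P1 residue, P1 position, subunit held responsible in the text)
CLEAVAGES = [
    ("beta1", "Arg", -10, "beta2"),
    ("beta6", "His", -10, "beta2"),
    ("beta2", "Leu", -15, "beta5"),
    ("beta6", "Ala", -17, "beta5"),
    ("beta6", "Thr", -14, "beta5"),
    ("beta7", "Val", -10, "beta1"),
    ("beta7", "Ile", -19, "beta5"),
]

for subunit, residue, position, protease in CLEAVAGES:
    print(f"{subunit} cut after {residue}{position} by {protease}: {classify_p1(residue)}")
```

The wild-type β7 cut after Asn-9 is deliberately left out of the list: the text attributes it to β2 on the basis of reported specificity (12), and Asn does not fall into any of the assumed residue sets.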

FIG. 2. Stereodiagram of the superposition of β1 (green) and N-acetyl-β1 (yellow) around the Thr1 site. The structures match closely.

The Role of the β1 Propeptide. The propeptide of β5 has been shown to be essential for cell viability but is functional when expressed in trans, suggesting a chaperone-like role in proteasome biogenesis (20). To investigate the role of the propeptide of β1, we have replaced its propeptide with ubiquitin. As in other linear ubiquitin fusions (22, 23), ubiquitin is cleaved by ubiquitin C-terminal hydrolases (24) to liberate the N-terminal threonine. The mutant proteasomes were inactive when assayed for postacidic cleavage (PGPH) activity. Their β1 subunit could be isolated by HPLC but was blocked for N-terminal sequencing. Structural analysis of the mutant proteasomes showed no significant differences to the wild-type structure except for extra density at the amino group of Thr1 that was interpreted as an acetyl group (Fig. 2) and confirmed by mass spectrometry (Table 2). We conclude that the propeptide of β1 has a role in preventing co- or posttranslational acetylation and inactivation of this subunit. The lack of enzymatic activity of the N-acetyl-β1 mutant supports the proposed mechanism of catalysis (9) assigning to the amino group of Thr1 the role of the proton acceptor, but steric hindrance of substrate docking by the acetyl group also may contribute to inactivity. It is noted that the acetyl group is not cleaved via autolysis, probably for steric and electronic reasons. The role of the conserved Lys33 is in maintaining the appropriate structure and electrostatic potential in the vicinity of the active site (7, 21).

Table 3. Crystallographic data collection and refinement statistics

| | β2(Pup1) | β5(Pre2) | β1(Pre3) | β1(Pre3) | β3(Pup3) | β6(Pre7) –5* | β1(Pre3) without propeptide |
|---|---|---|---|---|---|---|---|
| Space group | P21 | P21 | P21 | P21 | P21 | P21 | P21 |
| Cell constant a, Å | 136.7 | 135.6 | 135.4 | 135.5 | 135.5 | 135.7 | 135.9 |
| Cell constant b, Å | 300.6 | 300.3 | 302.5 | 300.7 | 301.2 | 300.3 | 301.6 |
| Cell constant c, Å | 145.2 | 144.0 | 145.5 | 144.4 | 146.5 | 144.6 | 144.5 |
| Cell constant β, ° | 113.1 | 113.0 | 112.6 | 112.9 | 112.8 | 113.2 | 112.7 |
| Resolution, Å | 50–2.5 | 50–2.5 | 50–2.7 | 50–1.9 | 50–2.9 | 50–1.95 | 50–2.9 |
| Observations, 2σ | 606959 | 951542 | 702094 | 2181093 | 631736 | 1755406 | 600449 |
| Uniques | 289028 | 343517 | 270036 | 752101 | 225640 | 731544 | 218345 |
| Completeness, % | 88.3 | 93.3 | 93.8 | 92.9 | 94.5 | 91.1 | 94.2 |
| Rmerge, % | 14.4 | 12.9 | 12.8 | 11.9 | 12.2 | 12.1 | 13.6 |
| R/Rfree, % | 30.3/36.5 | 26.0/31.2 | 27.5/36.4 | 26.8/33.0 | 21.1/27.3 | 28.5/32.1 | 22.7/29.7 |
| rms bonds, Å | 0.012 | 0.012 | 0.012 | 0.011 | 0.011 | 0.011 | 0.012 |
| rms angles, ° | 2.0 | 1.9 | 1.84 | 1.8 | 1.85 | 1.933 | 1.89 |
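Table 3 reports Rmerge and R/Rfree without defining them. Assuming the conventional crystallographic definitions (the paper does not spell them out), they can be written as follows, where I_i(hkl) is the i-th measurement of a reflection, ⟨I(hkl)⟩ is the mean over its measurements, and F_obs and F_calc are the observed and calculated structure factors:

```latex
R_{\mathrm{merge}} = \frac{\sum_{hkl}\sum_{i}\left|I_i(hkl)-\langle I(hkl)\rangle\right|}{\sum_{hkl}\sum_{i} I_i(hkl)},
\qquad
R = \frac{\sum_{hkl}\bigl|\,|F_{\mathrm{obs}}(hkl)|-|F_{\mathrm{calc}}(hkl)|\,\bigr|}{\sum_{hkl}|F_{\mathrm{obs}}(hkl)|}
```

Rfree is the same expression as R, evaluated over a test set of reflections that is excluded from refinement.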

FIG. 3. (a) Stereodiagram of the β1T1A β5K33R double mutant in the vicinity of residue Thr1 in β5. The electron density is calculated with phases from the wild-type β5 model. (b) Stereodiagram of wild-type (green) and β5K33R (white) mutant around Thr1. They superimpose closely except for the site of mutation. β5K33R autolyses and has a free Thr1. (c) Comparison of the wild-type (green) and β5K33A (white) mutant. Loss of the Lys33 side chain leads to a large movement of the backbone of Thr1. The mutant is unable to autolyse and has the propeptide attached.

The Role of Lys33 in the Enzymatic Mechanism. The conservative exchange of Lys33 to arginine abolishes both autolysis and proteolysis in T. acidophilum proteasomes (25). We were particularly interested in this mutation because arginine in position 33 occurs naturally in the subunit β7 of the yeast proteasome, where it displaces the Thr1 side chain, leading to incompetence in autolysis and to enzymatic inactivity (7). We exchanged Lys33 in β5 of the yeast proteasome with arginine. Crystals that diffract to 1.9-Å resolution (Table 3) could be obtained with a double mutant, which additionally has Thr1 replaced by Ala in subunit β1. In contrast to wild-type β7 and the quintuple mutant of β6 (see below), all residues in the vicinity of the active site of β5, including Thr1, remain in their wild-type positions. The arginine residue has its side chain in the same orientation as the lysine residue, but its guanidino group is tilted with respect to the position of the amino group in the lysine residue to avoid a clash with Thr1 (Fig. 3 a and b). As in T. acidophilum proteasomes, the chymotryptic activity of this mutant against chromogenic substrates is abolished. However, in contrast to results obtained for the T. acidophilum proteasome, the propeptide in the yeast mutant is cleaved. We attribute this observation to a weak residual activity that suffices for autolysis during particle maturation. The mutant grows slowly at 30°C but not at 37°C (11), and it overexpresses 20S proteasomes. The phenotype could be attributable either to the lack of chymotryptic activity or to delayed or impaired proteasome maturation. Genetic studies favor the latter explanation (11, 20). Because autolysis still occurs in the β5K33R mutant, we analyzed the β5 mutant carrying the Lys33Ala mutation. The mutant strain was viable, although again it grew slowly and contained unusually high amounts of 20S proteasome. As expected, both autolysis and proteolysis no longer occurred. The 2.5-Å crystal structure of this mutant shows defined density for the propeptide and a major rearrangement of the position of Thr1 that fills the cavity created by the loss of the lysine residue and displaces Met45 (Fig. 3c). Mass spectrometry of a fraction separated by HPLC, however, also showed the presence of some correctly processed species (Table 2).

FIG. 4. A gallery of superpositions of main chain traces around Thr1. a and b show the three active subunits β1, β2, and β5. In c and d, β1 is compared with wild-type β3 and β3G1T, respectively; in e and f, β1 is superimposed with wild-type β6 and the 5-fold β6 mutant (β6*); and, in g and h, β1 is compared with β7 and β4.

Some Structural and Functional Comparisons. A comparison of the refined molecular models of the mutants β1T1A, β2T1A, and β1T1A β2T1A showed no significant variation of subunit positions or backbone structures. The activity of a particular subunit against chromogenic substrates is not significantly altered by the presence or absence of intact active sites in other subunits. We had previously shown that the covalent binding of a specific irreversible inhibitor to β2 has no significant influence on the PGPH and chymotryptic activities associated with β1 and β5 and does not cause noticeable structural changes (26). Similarly, there is no measurable change in the activity and structure of β1 and β5 on strong binding of bifunctional reversible inhibitors to β2 (27). Also, yeast 20S proteasome with lactacystin bound to β5 shows no structural change compared with the unligated species (7). These results do not support the existence of allosteric interactions between the active sites in general and argue against interactions mediated by conformational equilibria in particular. We are aware, however, that crystal lattice forces may oppose ligand-induced conformational changes occurring in solution.

FIG. 5. Stereo diagram of the 5-fold β6 mutant in the vicinity of residue 1. The electron density is calculated with phases from the wild-type β6 model (black bonds). The mutations A129S, G1T, and the rearrangement of K33 are clearly visible (red bonds).

Reactivation Studies. All proteasomal β-subunits are members of a family of proteins that have diverged from a single ancestor, possibly similar to the archaebacterial β subunit. Nevertheless, only three subunits, β1, β2, and β5, are proteolytically active in yeast and higher eukaryotes. The other β-subunits, β3, β4, β6, and β7, are inactive and unable to autolyse. β3, β4, and β6 lack the nucleophilic threonine in position 1, and β7 has Arg33 and Phe129 instead of Lys33 and Ser129, respectively, as the most conspicuous changes.

The conservation of backbone geometry and of the majority of residues making up the active site, even in inactive proteasome β-subunits, has prompted us to investigate the possibility of reactivating inactive subunits. We first chose the inactive subunits β3 and β6 as promising targets for subunit activation experiments because of the close similarity of their backbone fold with the active subunits β1, β2, and β5 (Fig. 4 a-c and e).

β3. Gly1 replaces the canonical threonine in β3 as the most conspicuous exchange. It was mutated to threonine. The resultant yeast strain is viable and does not show a growth phenotype. Purified proteasomes from this strain show a blocked N terminus, as does the wild type. An antibody was raised against β3, and the migration of the mutant and of the wild-type subunit on denaturing SDS gels was compared. No difference could be observed, implying that the propeptide was not cleaved and the subunit remains inactive. Mass spectrometry confirms these results (Table 2). Additionally, we determined the crystal structure of this mutant, which, when compared with the wild-type β3-subunit, does not show major rearrangements and confirms that the propeptide is attached (Fig. 4 c and d).

β6. We repeated the experiment in an analogous manner with β6. Although this subunit has a severely impaired catalytic machinery with Gly1, Ala129, His166, and Val169 instead of the canonical Thr1, Ser129, Asp166, and Ser169, its backbone superimposes well with those of the active subunits, and the position of Lys33 is identical (Fig. 4e). Gly1 is shifted slightly toward Lys33 compared with the active subunits. We have replaced Gly1, Ala129, His166, and Val169 by their equivalents in active subunits. In addition, we exchanged Ala130 with glycine because this residue is conserved in all three active subunits, although its role in catalysis is not obvious. The 5-fold mutant is again viable, but it has a severe growth defect. In comparison with the wild type, cells are up to 10× larger and express severalfold more proteasome, which could be purified and crystallized. The crystal structure analysis at 1.95-Å resolution shows defined electron density at β6 for all nine residues of the partially processed propeptide, but it is substantially lower than in the wild type and particularly blurred at residues Asn-2 and Gly-1. Temperature factors of the propeptide are very high. Also, the mass spectrum of the corresponding HPLC fraction showed the molecular weight of the Gln-9 species, but a component with the molecular weight corresponding to the autolysed species also occurs. We conclude that the mutant protein is partially autolysed. Residues Asp17, Ser129, Asp166, and Ser169 of the mutant subunit β6 are positioned as the corresponding residues in active wild-type subunits, but Thr1 remains where Gly1 is in the wild-type β6 subunit. A close contact between the Thr1 and Lys33 side chains displaces the lysine side chain into an outwardly oriented position, where it is stabilized by hydrogen bonds to Glu31 and Asp53 (Figs. 4f and 5). The distortion of Thr1 with respect to its position in active subunits prevents the binding of a water molecule in the vicinity of Thr1, as seen in active subunits (7). As the phenotype of the mutant is unlikely to be accounted for by an extra proteasomal activity, we have looked for other explanations. The major activities of wild-type proteasomes are present but somewhat reduced. Therefore, we suspect a decreased stability of the quintuple mutant. β6 is in contact with β5 and β7 in the same ring and β2 and β3 of the opposite ring. His166 and Val169 contact β5, β2, and β3, whereas Ala129 and Ala130 contact β7. As seen from the lack of a phenotype of the triple mutant, β6G1T A129S A130G, which presumably has a displaced lysine residue and impaired contacts with β7, and from the lack of a phenotype of the quadruple mutant β6A129S A130G H166D V169S, individual residue effects appear to be weak. Only in the quintuple mutant, where contacts of β6 with all neighboring β-subunits are disturbed, is a notable phenotype seen.

β7. Based on our observation that the displacement of the mutationally introduced Thr1 with respect to its position in active subunits could explain the failure to activate β3 and β6, and based on the perfect match of the polypeptide backbone around Thr1 of β7 and of the active subunits (Fig. 4g), we then attempted to activate β7. Two residues have to be replaced, Arg33 and Phe129. The resulting yeast strain was viable and indistinguishable from the wild type. N-terminal sequencing of β7 revealed the presence of the wild-type propeptide. In the absence of a crystal structure, we can only suspect that the distortion in the backbone of wild-type β7 in the region around Phe129, which we attribute mainly to unfavorable interactions with Asp166 (Fig. 4g), is still present in the mutant and responsible for the inactivity and inability to autolyse. We did not try to activate β4 because major differences exist between the Cα-traces of this subunit and of the active subunits (Fig. 4h).

Note Added in Proof. While this paper was in press, a publication by Arendt and Hochstrasser (28) appeared suggesting acetylation of β1, β2, and β5 subunits by genetic methods in mutants lacking the respective propeptides. These results are in agreement with our findings in β1 by analytical methods.

We thank Silvia Körner and Frank Siedler (Max-Planck-Institut für Biochemie, Martinsried, Germany) for help with mass spectrometry, Karlheinz Mann (Max-Planck-Institut für Biochemie, Martinsried, Germany) for help with N-terminal sequence analysis, and G.B. Bourenkow and H. Bartunik (DESY, Hamburg, Germany) for assistance with the x-ray experiments. The Sonderforschungsbereich 469 provided financial support. The work was furthermore supported by a grant from the Deutsche Forschungsgemeinschaft (Bonn) and the Fonds der Chemischen Industrie (Frankfurt).

1. Hilt, W. & Wolf, D.H. (1996) Trends Biochem. Sci. 21, 96–102.
2. Hershko, A. & Ciechanover, A. (1998) Annu. Rev. Biochem. 67, 425–479.
3. Hochstrasser, M. (1996) Annu. Rev. Genet. 30, 405–409.
4. Baumeister, W., Walz, J., Zühl, F. & Seemüller, E. (1998) Cell 92, 367–380.
5. Peters, J.M., Cejka, Z., Harris, R.J., Kleinschmidt, J.A. & Baumeister, W. (1993) J. Mol. Biol. 234, 932–937.
6. Coux, O., Tanaka, K. & Goldberg, A.L. (1996) Annu. Rev. Biochem. 65, 801–847.
7. Groll, M., Ditzel, L., Löwe, J., Stock, D., Bochtler, M., Bartunik, H.D. & Huber, R. (1997) Nature (London) 386, 463–471.
8. Dahlmann, B., Kopp, F., Kuehn, L., Niedel, B., Pfeifer, G., Hegerl, R. & Baumeister, W. (1989) FEBS Lett. 251, 125–131.
9. Löwe, J., Stock, D., Jap, B., Zwickl, P., Baumeister, W. & Huber, R. (1995) Science 268, 533–539.
10. Seemüller, E., Lupas, A., Stock, D., Löwe, J., Huber, R. & Baumeister, W. (1995) Science 268, 579–581.
11. Heinemeyer, W., Fischer, M., Krimmer, T., Stachon, U. & Wolf, D.H. (1997) J. Biol. Chem. 272, 25200–25209.
12. Nussbaum, A.K., Dick, T.P., Keilholz, W., Schirle, M., Stevanovic, S., Dietz, K., Heinemeyer, W., Groll, M., Wolf, D.H., Huber, R., et al. (1998) Proc. Natl. Acad. Sci. USA 95, 12504–12509.
13. Brunger, A. (1992) X-PLOR Version 3.1; A System for X-Ray Crystallography and NMR (Yale Univ. Press, New Haven, CT).
14. Turk, D. (1992) Ph.D. thesis (Technical Univ., Munich).
15. Jones, T.A. (1978) J. Appl. Crystallogr. 15, 24–31.
16. Schmidtke, G., Kraft, R., Kostka, S., Henklein, P., Frömmel, C., Löwe, J., Huber, R., Kloetzel, P.M. & Schmidt, M. (1996) EMBO J. 15, 6887–6898.
17. Ditzel, L., Stock, D. & Löwe, J. (1997) Biol. Chem. 378, 239–247.
18. Frentzel, S., Pesold-Hurt, B., Seelig, A. & Kloetzel, P.M. (1994) J. Mol. Biol. 236, 975–981.
19. Nandi, D., Woodward, E., Ginsburg, D.B. & Monaco, J.J. (1997) EMBO J. 16, 5363–5375.
20. Chen, P. & Hochstrasser, M. (1996) Cell 86, 961–972.
21. Ditzel, L., Huber, R., Mann, K., Heinemeyer, W., Wolf, D.H. & Groll, M. (1998) J. Mol. Biol. 279, 1187–1191.
22. Bachmair, A., Finley, D. & Varshavsky, A. (1986) Science 234, 179–186.
23. Arfin, S.M. & Bradshaw, R.A. (1988) Biochemistry 27, 7979–7984.
24. Wilkinson, K.D. (1997) FASEB J. 11, 1245–1256.
25. Seemüller, E., Lupas, A. & Baumeister, W. (1996) Nature (London) 382, 468–470.
26. Loidl, G., Groll, M., Musiol, H.-J., Ditzel, L., Huber, R. & Moroder, L. (1999) Chem. Biol. 6, 197–204.
27. Loidl, G., Groll, M., Musiol, H.-J., Huber, R. & Moroder, L. (1999) Proc. Natl. Acad. Sci. USA 96, 5418–5422.
28. Arendt, C.S. & Hochstrasser, M. (1999) EMBO J. 18, 3575–3585.


The structure of the human βII-tryptase tetramer: Fo(u)r better or worse

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

CHRISTIAN P. SOMMERHOFF*†, WOLFRAM BODE‡, PEDRO J. B. PEREIRA‡, MILTON T. STUBBS§, JÖRG STURZEBECHER¶, GERD P. PIECHOTTKA*, GABRIELE MATSCHINER*, AND ANDREAS BERGNER‡

*Abteilung Klinische Chemie und Klinische Biochemie in der Chirurgischen Klinik und Poliklinik, Klinikum Innenstadt der Ludwig-Maximilians-Universität, Nußbaumstrasse 20, D-80336 Munich, Germany; ‡Abteilung für Strukturforschung, Max-Planck-Institut für Biochemie, Am Klopferspitz 18a, D-82152 Martinsried, Germany; §Institut für Pharmazeutische Chemie der Philipps-Universität Marburg, Marbacher Weg 6, D-35032 Marburg, Germany; and ¶Klinikum der Universität Jena, Zentrum für Vaskuläre Biologie und Medizin, Nordhäuserstrasse 78, D-99089 Erfurt, Germany

ABSTRACT Tryptases, the predominant serine proteinases of human mast cells, have recently been implicated as mediators in the pathogenesis of allergic and inflammatory conditions, most notably asthma. Their distinguishing features, their activity as a heparin-stabilized tetramer and resistance to most proteinaceous inhibitors, are perfectly explained by the 3-Å crystal structure of human βII-tryptase in complex with 4-amidinophenylpyruvic acid. The tetramer consists of four quasiequivalent monomers arranged in a flat frame-like structure. The active centers are directed toward a central pore whose narrow openings of approximately 40 Å × 15 Å govern the interaction with macromolecular substrates and inhibitors. The tryptase monomer exhibits the overall fold of trypsin-like serine proteinases but differs considerably in the conformation of six surface loops arranged around the active site. These loops border and shape the active site cleft to a large extent and form all contacts with neighboring monomers via two distinct interfaces. The smaller of these interfaces, which is exclusively hydrophobic, can be stabilized by the binding of heparin chains to elongated patches of positively charged residues on adjacent monomers or, alternatively, by high salt concentrations in vitro. On tetramer dissociation, the monomers are likely to undergo transformation into a zymogen-like conformation that is favored and stabilized by intramonomer interactions. The structure thus provides an improved understanding of the unique properties of the biologically active tryptase tetramer in solution and will be an incentive for the rational design of mono- and multi-functional tryptase inhibitors.

Human mast cell tryptases (EC 3.4.21.59) comprise a family of trypsin-like serine proteinases closely related in sequence that are derived from ≥3 nonallelic genes (1, 2). Tryptases (at least isoenzymes αI, βI, βII, and βIII) are highly and selectively expressed in mast cells and to a lesser extent in basophils (3, 4). Only β-tryptases, however, appear to be activated intracellularly and stored in secretory granules (5, 6), accumulating to much larger amounts than any other of the granule-associated serine proteinases of leukocytes and lymphocytes. On mast cell activation, β-tryptases are secreted bound to heparin in diverse allergic and inflammatory conditions ranging from asthma and rhinitis to psoriasis and multiple sclerosis. Various studies performed in animals and humans have provided considerable evidence that tryptases are directly involved in the pathogenesis of asthma (7–9), a hypothesis also supported by apparent genetic links of tryptases to airway reactivity (10, 11).

Several unique properties distinguish tryptases from other trypsin-like proteinases (reviewed in refs. 12 and 13). Most notably, tryptases are enzymatically active in the form of a noncovalently linked tetramer. The tetramer is stabilized by association with negatively charged aminoglycans such as heparin or by high ionic strength conditions in vitro. On dissociation, which is reversible only under certain conditions, the monomers lose activity, apparently because of transition into a zymogen-like state (14, 15). This mechanism is thought to govern tryptase activity in vivo. With the exception of the “atypical” Kazal-type inhibitor leech-derived tryptase inhibitor (LDTI) (16, 17), human tryptases are resistant to inhibition by proteinaceous inhibitors. In accordance with their trypsin-like activity, tryptases efficiently hydrolyze a number of peptide substrates including the neuropeptides “vasoactive intestinal peptide” and “peptide histidine methionine” (18). Few macromolecular substrates are cleaved, however, leading to the activation of prostromelysin, prourokinase, and the proteinase-activated receptor-2 (19–21) and the inactivation of fibronectin and of the procoagulant functions of high molecular mass kininogen and fibrinogen (22–24).

These distinguishing features are well explained by the crystal structure of the human lung βII-tryptase tetramer, whose overall architecture has been summarized recently (25). Here, we describe the identification of the tetramer within the crystal packing, the detailed structure of the monomers, and their interactions in the tetramer. In addition, structural features likely to favor a zymogen-like conformation of isolated monomers and models of the interaction with stabilizing heparin proteoglycans and inhibitors are presented.

†To whom reprint requests should be addressed. E-mail: [email protected]
PNAS is available online at www.pnas.org.
Abbreviations: APPA, 4-amidinophenylpyruvic acid; LDTI, leech-derived tryptase inhibitor.
Data deposition: The atomic coordinates have been deposited in the Protein Data Bank, www.rcsb.org (PDB ID code 1A0L).

Identification of the Relevant Tryptase Tetramer. In the x–y plane of the tryptase crystals, the tryptase monomers are arranged in flat rectangular tetrameric aggregates that form extended protein layers (Fig. 1a). Within these layers, each tetramer is rotated about the crystallographic a- and b-axes by ≈7°, in agreement with the self-rotation function. The tetramers appear well separated from their neighbors in one direction (x-direction in Fig. 1a) but are in somewhat closer contact in the perpendicular direction (y in Fig. 1a). In the z-direction, the tetramers are stacked along the crystallographic 4₁ screw axis. Because of the 7° tilt of each tetramer from the x–y plane, their projections (Fig. 1b) alternate between leaning to the left, being horizontal, and leaning to the right, respectively, giving rise to a 7° precession motion of the local (2-fold; see below) rotation axis along the crystallographic 4₁ screw axis. The largely complementary interaction surfaces between the monomers of the tetramer are typical for intersubunit contacts, whereas neighboring tetramers interact with one another via much more usual crystal contacts. Thus, within a tetramer, monomer A (Fig. 2) interacts with monomers B and D via interfaces of sizes 540 Å² and 1,075 Å², respectively (solvent inaccessible surface probed by using a sphere of 1.4-Å radius; Collaborative Computational Project No. 4 suite). In contrast, the four monomers of one given tetramer interact with monomers from neighboring tetramers via interfaces of less than 280 Å² (in the x–y plane) and 265 Å² (along the z-axis), respectively. The contacts between tetramers include a number of hydrogen bonds and six unique salt bridges and thus are qualitatively similar to those usually observed in typical crystal contacts.
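The stacking geometry described in this paragraph can be made concrete with a small numerical sketch. The Python snippet below is illustrative only: it assumes an idealized tilt of exactly 7°, an arbitrary starting azimuth of 0°, and a 90° rotation per layer as implied by a 4₁ screw axis. The function names are hypothetical and are not taken from the original work.

```python
import math

TILT_DEG = 7.0         # tilt of the local 2-fold axis from z, as quoted in the text
SCREW_STEP_DEG = 90.0  # rotation per layer for a 4_1 screw axis


def local_axis(layer: int) -> tuple[float, float, float]:
    """Unit vector of the local 2-fold axis of the tetramer in a given layer."""
    tilt = math.radians(TILT_DEG)
    # A starting azimuth of 0 degrees for layer 0 is an assumption of this sketch.
    azimuth = math.radians(SCREW_STEP_DEG * layer)
    return (
        math.sin(tilt) * math.cos(azimuth),
        math.sin(tilt) * math.sin(azimuth),
        math.cos(tilt),
    )


def apparent_lean_deg(layer: int) -> float:
    """Lean of the axis away from z when viewed in projection along y."""
    x, _, z = local_axis(layer)
    return math.degrees(math.atan2(x, z))


if __name__ == "__main__":
    # Four successive layers span one full turn of the screw axis.
    for layer in range(4):
        print(layer, f"{apparent_lean_deg(layer):+.1f}")
```

Under these assumptions the projected lean cycles through one side, upright, the other side, and upright again over four layers, while the true tilt from z stays constant, which is the precession described in the text.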

FIG. 1. Packing of the human βII-tryptase crystal. (a) View along the z-axis showing one layer of tryptase molecules in the x–y plane. The tryptase monomers are grouped into tetrameric aggregates that form extended sheets. Each of these tryptase tetramers is clearly delimited from its neighbors in both directions. A “reference” tetramer is shown in red for simplicity. (b) View across the z-axis. In the z-direction, layers of tetramers are stacked on each other along the 4₁ screw axis. The local 2-fold symmetry axis is tilted from the z-direction by ≈7°, causing increased crystal-stabilizing contacts between layers stacked in the z-direction. One unit cell (82.9 × 82.9 × 172.9 Å), occupied by four tryptase tetramers, is indicated by a white bordered box.
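As a rough plausibility check on the unit cell quoted in the Fig. 1 legend, the packing density can be estimated. The sketch below is a back-of-the-envelope calculation under stated assumptions, not a result from the paper: it assumes a cell with 90° angles (as expected for a lattice with a 4₁ axis), a mean residue mass of about 110 Da applied to the 245 residues per monomer, and it ignores glycosylation and bound ligands. The solvent estimate uses the common rule of thumb 1 − 1.23/V_M. All names are hypothetical.

```python
# Unit cell edges in angstroms, from the Fig. 1 legend.
CELL_A, CELL_B, CELL_C = 82.9, 82.9, 172.9

TETRAMERS_PER_CELL = 4       # from the Fig. 1 legend
MONOMERS_PER_TETRAMER = 4
RESIDUES_PER_MONOMER = 245   # stated in the monomer description
MEAN_RESIDUE_MASS_DA = 110.0 # assumption: generic average residue mass


def cell_volume() -> float:
    """Volume in cubic angstroms, assuming all cell angles are 90 degrees."""
    return CELL_A * CELL_B * CELL_C


def matthews_coefficient() -> float:
    """Crystal volume per unit protein mass (cubic angstroms per dalton)."""
    monomers = TETRAMERS_PER_CELL * MONOMERS_PER_TETRAMER
    protein_mass = monomers * RESIDUES_PER_MONOMER * MEAN_RESIDUE_MASS_DA
    return cell_volume() / protein_mass


def solvent_fraction() -> float:
    """Approximate solvent content from the usual 1 - 1.23 / V_M estimate."""
    return 1.0 - 1.23 / matthews_coefficient()


if __name__ == "__main__":
    print(f"V   = {cell_volume():.0f} A^3")
    print(f"V_M = {matthews_coefficient():.2f} A^3/Da")
    print(f"solvent fraction = {solvent_fraction():.2f}")
```

With these assumed masses the estimate falls in the range commonly seen for protein crystals, which is consistent with four tetramers per cell; a different mass assumption would shift the numbers somewhat.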

These packing considerations suggest that the tetramer emphasized in Fig. 1 represents the enzymatically active tetramer of human β-tryptase. This tetramer selection is supported by the finding that the six loops that deviate most from the structures of other trypsin-like proteinases are all involved in forming monomer-monomer contacts within a tetramer. More important, this unique tetramer perfectly explains the distinguishing properties of tryptase in solution, e.g., the resistance to proteinaceous inhibitors other than LDTI, the unusual substrate specificity, and the stabilization by the binding of heparin-like glycosaminoglycans (see below).

Overall Tetramer Structure. In the tryptase tetramer, monomers (arbitrarily assigned A, B, C, and D in Fig. 2) are positioned at the corners of a flat rectangular frame leaving a continuous central pore. The tetramer displays almost perfect 222 symmetry that, however, is not exact because of the crystallographically asymmetric environment and an imperfect internal packing (see below). The horizontal and the vertical 2-fold axes, which cross each other in the center of the tetramer, relate monomers A to B and C to D, or A to D and B to C, respectively. The third 2-fold symmetry axis relating monomers A to C and B to D is arranged virtually perpendicular to the other 2-fold axes and runs almost through their point of intersection in the central pore.
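The relationship between the three 2-fold axes can be illustrated with idealized rotation matrices. The following Python sketch assumes exact 222 symmetry (the text notes that the real tetramer only approximates it) and places the four monomers at hypothetical corner coordinates of a flat rectangle, chosen so that A, B, C, and D run clockwise. It shows that combining the horizontal and vertical 2-fold rotations yields the third 2-fold axis, and that each axis pairs the monomers as described above.

```python
# Idealized 180-degree rotations about x (horizontal), y (vertical), and z.
C2_X = ((1, 0, 0), (0, -1, 0), (0, 0, -1))
C2_Y = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))

# Hypothetical corner positions (arbitrary units) of a flat rectangular frame.
POSITIONS = {"A": (-2, -1, 0), "B": (-2, 1, 0), "C": (2, 1, 0), "D": (2, -1, 0)}


def matmul(a, b):
    """Product of two 3x3 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def partner_map(m):
    """Map each monomer to the monomer whose position it is rotated onto."""
    lookup = {pos: name for name, pos in POSITIONS.items()}
    return {name: lookup[apply(m, pos)] for name, pos in POSITIONS.items()}


# Two perpendicular 2-fold rotations combine into the third one.
C2_Z = matmul(C2_X, C2_Y)

if __name__ == "__main__":
    print("horizontal axis:", partner_map(C2_X))
    print("vertical axis:  ", partner_map(C2_Y))
    print("third axis:     ", partner_map(C2_Z))
```

In the crystal the three axes only nearly intersect and are only nearly perpendicular, so this exact group closure should be read as the idealized limit.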

The active centers of the four monomers are directed toward the central pore (Fig. 2). This pore exhibits a rectangular cross section and is twisted by ≈30° about the tetramer axis. It possesses two narrow openings of dimension 40 Å × 15 Å and widens in its central part to a cross section of 50 Å × 25 Å, just large enough for elongated peptides of the diameter of an α-helix to thread through the exits and to interact with the active sites. Both pore entrances are partially obscured by the 147-loops (see below), which project from each of the monomers but on alternate entrance sides, so that only two diagonally arranged active centers can be viewed directly (Fig. 2). With 33 basic (including 12 His residues) and 24 acidic residues per monomer, human tryptase exhibits an average percentage of charged residues comparable to related serine proteinases, but is only slightly positively charged at neutral pH. These charges are not evenly distributed along the molecular surface, however. Rather, negatively charged residues cluster preferentially on the inner pore-facing surface, conferring a quite negative electrostatic potential on the pore, and along the peripheral A–D (and B–C) edges. In contrast, the A–B (and C–D) peripheries and one front side of the monomer surface are positively charged and probably are involved in heparin binding (see below and Fig. 6).

Monomer Structure. The tryptase monomer exhibits the typical β-strand-dominated fold seen in other trypsin-like serine proteinases. The core is made by two six-stranded β-barrels that are packed together and further clamped by three transdomain segments (Fig. 3). This core structure is covered by a number of polypeptide loops, a short α-helical turn (Ala-55–Gly-66, not shown in Fig. 3a), and two regular α-helices, the so-called “intermediate helix” (Glu-164–Leu-173A) and the C-terminal helix (Arg-230–Val-242). The catalytic residues Ser-195, His-57, and Asp-102 (chymotrypsinogen numbering) are located in the junction between both barrels. The active-site cleft runs perpendicular to this barrel junction. In the “standard orientation” shown in Fig. 3, this cleft runs approximately horizontally across the molecular surface facing the viewer and is ready to accommodate and bind extended peptide substrates running from left to right. One hundred sixty-two and 168 residues of the tryptase monomer are topologically equivalent to the archetypal proteinases chymotrypsin (26) and trypsin (27), respectively, with an rms deviation of their α-carbon atoms of 0.65 Å for both comparisons. The numbering of the tryptase residues given in this article is predominantly based on the equivalence with chymotrypsinogen (28) and, at only a few trypsin-characteristic sites, on that with trypsin (27).

FIG. 2. Overall structure of the tryptase tetramer. The four monomers A, B, C, and D (clockwise) are shown as blue, red, green, and yellow ribbons, each surrounded by a semitransparent surface. The inhibitor molecules APPA are given as orange CPK models, each binding into one of the four S1 specificity pockets.

FIG. 3. The tryptase monomer in standard orientation, i.e., as seen approximately from the middle of the central pore of the tetramer toward the active site of monomer A (represented by Ser-195, His-57, and Asp-102). (a) Ribbon representation of a tryptase monomer. The amidino group of the APPA molecule interacts with Asp-189 in the S1 pocket. Ser-195 O-γ is bound covalently to the APPA carbonyl group forming a hemiketal. The six unique surface loops of tryptase that surround the active site and are engaged in intermonomer contacts are shown in special colors, namely (anticlockwise) the 147-loop (light blue), the 70- to 80-loop (yellow), the 37-loop (orange), the 60-loop (magenta), the 97-loop (green), and the 173-flap (red). All other tryptase segments are given in dark blue. The side chains of the catalytic triad residues as well as Asp-143, Asp-145, and Asp-147 in the acidic 147-loop are shown as a ball-and-stick model. (b) Overlay of the structures of the tryptase monomer and bovine trypsin, both given as ropes. The color-coding of tryptase is as in a, whereas trypsin is shown in gray. The most relevant deviations from the trypsin backbone appear in the colored loop regions of tryptase.

In detail, however, the topology of the tryptase monomers deviates significantly from these reference proteinases (Fig. 3b), probably more than any other trypsin-like serine proteinase. In particular, six surface loops that border and shape the active-site cleft are unique (Fig. 3a). These loops comprise the 147-loop (including the 152-“spur”), the 70- to 80-loop, the 37-loop, the 60-loop, the 97-loop, and the 173-loop (Fig. 3a). The 147-loop, which together with Gln-192 forms the rather acidic southern wall of the active-site cleft, is shortened by one residue in its initial part, but contains a two-residue insertion (Pro-152-Pro-152A-cisPro-152B-Phe-153-Pro-154) in its proline-rich and hydrophobic 152-spur. The neighboring 70- to 80-loop to the east, which in the calcium-binding serine proteinases winds around a stabilizing calcium ion (27), is three residues shorter and more compact in tryptase. It is probably not designed for calcium binding, in spite of topologically similar liganding groups; Glu-70 and Asp-80, involved in a partially buried salt bridge cluster with Arg-34, are oppositely arranged to the two calcium-binding Glu residues in trypsin. The 37-loop, above the 70- to 80-loop, possesses two additional residues (Pro-37A and Tyr-37B), which bulge away from the loop axis. The adjacent 60-loop, with five inserted residues, turns away from the cleft abruptly to the north, where it kinks at cisPro-60A to approach the general main chain course of other serine proteinases. At position 69, a buried Arg replaces the Gly residue that is strictly conserved in most other homologous proteinases, allowing for a special conformation. Although the 97-loop, at the northern rim of the cleft, contains the same number of residues as other serine proteinases, it differs considerably in conformation. The N-terminal part is shortened by two residues between positions 96 and 97, thus placing Ala-97 in the position normally occupied by residue 99, whereas its C-terminal part makes an unusual extra helical turn before arriving at Asp-102. By far the largest insertion, with nine residues, occurs in the 173-loop. After the unusually long three-turn intermediate helix, the 10 residues from His-173 to Val-173I form an exposed flap centered around the imidazole side chain of His-173.

With 245 amino acid residues, the tryptase monomer possesses 15 and 22 residues more than the B-chains of chymotrypsin and trypsin, respectively. Compared with chymotrypsinogen, most of these extra residues present in all tryptases known so far are inserted in the 37-loop (two residues), the 60-loop (+5), the 147-loop (+1), the 173-loop (+9), at position 221A (+1) and at the C terminus (+1), whereas the 70- to 80-loop (–3) and the 214- to 220-loop (–1, as in all trypsin-like serine proteinases) are shorter. On the reverse side, the largely hydrophobic cluster of four Trp residues (Trp-27, -29, -207, and -137) is noteworthy. Only the indole moieties of the latter two Trp are significantly exposed to the surface. At the C terminus, only the main chain atoms of the two penultimate residues Lys-244 and Lys-245 are well defined by electron density, while the C-terminal Pro-246 could not be located. The side chain of the single N-linked sugar attachment site in human βII-tryptase, Asn-204, extends away from the molecular surface opposite to the active site. Some residual electron density exists distal to its carboxamide group, which is not large enough to account for a covalently linked sugar residue.
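The residue bookkeeping in this paragraph can be checked with a short tally. The sketch below sums the insertions and deletions listed relative to chymotrypsinogen; the dictionary and function names are illustrative and not part of the original work. The net gain comes to 15 residues, consistent with the difference quoted for the chymotrypsin B-chain.

```python
# Insertions (+) and deletions (-) in tryptase relative to chymotrypsinogen,
# taken from the paragraph above. All names here are illustrative only.
LOOP_DELTAS = {
    "37-loop": +2,
    "60-loop": +5,
    "147-loop": +1,
    "173-loop": +9,
    "position 221A": +1,
    "C terminus": +1,
    "70- to 80-loop": -3,
    "214- to 220-loop": -1,
}


def net_residue_change(deltas):
    """Return (inserted, deleted, net) residue counts for a loop table."""
    inserted = sum(d for d in deltas.values() if d > 0)
    deleted = -sum(d for d in deltas.values() if d < 0)
    return inserted, deleted, inserted - deleted


inserted, deleted, net = net_residue_change(LOOP_DELTAS)
print(f"inserted={inserted}, deleted={deleted}, net={net}")
# prints: inserted=19, deleted=4, net=15
```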

As found in almost all trypsin-like serine proteinases [except, e.g., single-chain tissue type plasminogen activator (29)], the N-terminal Ile-16-Val-17 segment is inserted in the Ile-16 pocket, forming a solvent inaccessible salt bridge between its free Ile-16 α-amino group and the carboxylate group of Asp-194. The formation of this salt bridge after activation cleavage creates a functional substrate recognition site by reorienting the Asp-194 side chain from an external position in the zymogen, where it might hydrogen bond to a surface located His-40···Ser-32 pair forming the so-called “zymogen triad,” to an internal position in the active molecule (30, 31). This reorientation restructures the surrounding “activation domain,” which in trypsin(ogen) mainly includes the linings of the Ile-16 pocket and the S1 specificity pocket (i.e., segments Ile-16-Gly-19, Tyr-184-Asp-194, Gly-216-Asn-223, and Gly-142-Tyr-151), and the “oxyanion hole” formed by the amide groups of Gly-193 and Ser-195 (28, 32, 33). The single-chain zymogen and the activated monomer are adequately described by a two-state model, in which an inactive conformation is in equilibrium with an active form possessing a structured activation domain (31). The partition between both forms depends on environmental conditions such as the endogenous free Ile-16-Val-17 N-terminal segment (34), free Ile-Val dipeptide (31), ligands in the substrate binding site (30, 36), or other effectors such as fibrin with respect to tissue plasminogen activator or tissue factor in the case of Factor VIIa (29, 37). This conformational partition can be influenced by internal molecular groups that stabilize or destabilize one or the other state. Tryptase possesses the zymogen triad residues His-40 and Ser-32, which would stabilize the zymogen state. In addition, the acidic residues Asp-143, Asp-145, and Asp-147 arranged around the Ile-16 cleft could form a negatively charged anchoring site that could compete with the Ile-16 pocket for the Ile-16 α-amino group, thus destabilizing the structured active state of the tryptase monomer. Furthermore, some of the loops in contact with the activation domain of tryptase, such as the long 173-loop or the 70- to 80-loop, which has been shown to be strongly correlated with the equilibrium state in bovine elastase “subunit III” (38), could influence the structured state. The conformation of the tryptase 173-loop, probably held in place in the tetramer by contacts with monomer D, certainly has an effect on the stability of the integrated monomer. Interestingly, tissue factor, thought to support insertion of the N-terminal Ile-16 α-amino terminus of activated Factor VIIa B-chain on complex formation (37), likewise binds to the 173-loop at the intermediate helix flank (39).
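The two-state picture invoked above can be made concrete with a minimal numerical sketch. It assumes a simple equilibrium between an inactive and an active conformation with constant K = [active]/[inactive]; the function name and the K values are hypothetical and chosen only to show how stabilizing one state shifts the partition. No equilibrium constants are reported in the text.

```python
def active_fraction(k_eq):
    """Fraction of molecules in the active state of a two-state equilibrium.

    k_eq is [active] / [inactive]. The values used below are hypothetical,
    not measured constants for tryptase or trypsin.
    """
    if k_eq < 0:
        raise ValueError("equilibrium constant must be non-negative")
    return k_eq / (1.0 + k_eq)


# Hypothetical constants illustrating a zymogen-like, a balanced,
# and an active-like partition.
for label, k_eq in [("zymogen-like", 0.01), ("balanced", 1.0), ("active-like", 100.0)]:
    print(f"{label:>12}: K = {k_eq:g}, active fraction = {active_fraction(k_eq):.3f}")
```

In this reading, groups that stabilize the zymogen state (such as the zymogen triad) lower K, whereas effectors that favor insertion of the Ile-16 α-amino group raise it.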

Interfaces. All monomer-monomer contacts within the tetramer are realized via six loops arranged around the active center. These loops, emphasized by special colors in Figs. 3–5, differ fundamentally in their conformation and partly in size from those of other trypsin-like serine proteinases. Monomers A and B interact with one another through the 147-loop, the 70- to 80-loop, and the 37-loop (Fig. 4d). Each 152-spur slots into a cleft formed by the 37- and the 70- to 80-loop of its own monomer and the 152-spur of the opposing neighbor. At the center of the interface, the side chains of Phe-153 and Tyr-75 from each subunit form an approximate tetrahedron (Fig. 5a). The side chain of Tyr-75 from monomer B (D) would clash with the equivalent A (C) side chain if they were arranged in a symmetrical manner. Instead, the phenolic group of Tyr-75 of monomer A turns in the opposite direction, breaking the 2-fold symmetry (see the partial electron density in Fig. 5a). This A–B (C–D) interface is exclusively hydrophobic, with a remarkable number of Tyr and Pro side chains involved, and lacks any intermonomer hydrogen bonds. Toward the pore, the side chains of the two Arg-150 residues oppose one another. The charges of their guanidyl groups presumably make unfavorable energy contributions to the A–B interaction.

FIG. 4. Loop arrangements in the tetramer. The six special loops engaged in monomer-monomer interactions are shown in the color coding introduced in Fig. 3. (a) The D–A dimer as seen from outside of the tetramer along the local 2-fold axis. (b) The monomer viewed in standard orientation. (c) Front view of the tetramer. (d) The A–B dimer seen from outside of the tetramer along the local 2-fold axis.

Monomer A interacts with monomer D through the entire northern rim consisting of the 173-flap, the 97-loop, and the 60-loop (Figs. 4a and 5b), again via equivalent loops. Both 97-loops rest with their 95–99 segments on one another (Fig. 4a), with both Ile-99 side chains in direct contact. Further toward both peripheries, segment Pro-60A-Asp-60B and the opposing segment Gly-173B-Tyr-173D run antiparallel to one another, forming two-rung antiparallel ladders between Gly-173B-Tyr-173D and Pro-60A-Val-60C (Fig. 5b). Each Tyr-95 aromatic side chain nestles into the bend of the opposing 173-flap, and each Tyr-173D phenolic side chain slots into a hydrophobic cleft made by the 60-loop and the 97-loop of the opposing monomer. In addition, both monomers are cross-connected by salt bridges between Asp-60B and Arg-224 and by four hydrogen bonds involving both main and side chains (Fig. 5b). Thus, the A–D (and the corresponding B–C) interface comprises a number of polar/charged interactions in addition to several hydrophobic contacts.

FIG. 5. Stick representation of the contact interfaces between monomers. (a) The AB-interface seen from inside the tetramer along the local 2-fold axis, shown together with the final 2Fo-Fc electron density map for both Tyr-75 side chains contoured at 1 σ level. The monomers and loops are given in the color coding introduced in Figs. 3 and 4. (b) The AD-interface (half side) observed approximately perpendicular to the local 2-fold axis, shown together with all intermonomer hydrogen bonds and salt bridges (green dots). Segments of monomers A and D are given in blue and yellow, respectively.

The A–B homodimer carries a number of positively charged residues at the periphery, which cluster and form an obliquely oriented two-lobed patch of positive charges that extends toward one of the front sides of each monomer, giving rise to the blue-colored electrostatic potential surfaces in Fig. 6. With an overall length of almost 100 Å, this patch would allow tight electrostatic binding of an extended heparin chain of ~20 sugars running obliquely along the A–B edge as shown in Fig. 6. The length of such heparin chains is in good agreement with the experimentally observed stabilization of the tetramer by heparin fractions of molecular mass 5,500 Da and above (40). On the peripheral surface of the A–D (and the corresponding B–C) homodimer, in contrast, positive charges are counter-balanced by adjacent negative ones.
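The agreement between chain length and heparin mass noted above can be illustrated with back-of-the-envelope arithmetic. The sketch below divides the patch length of about 100 Å and the 5,500 Da minimum stabilizing fraction by the roughly 20 sugars of the modeled chain. Treating the 5,500 Da fraction as a 20-sugar chain is an assumption made here for the calculation, and the constant and function names are illustrative.

```python
# Approximate figures quoted in the text.
PATCH_LENGTH_A = 100.0            # length of the positive patch, in angstroms
CHAIN_SUGARS = 20                 # sugars in the modeled heparin chain
MIN_STABILIZING_MASS_DA = 5500.0  # smallest stabilizing heparin fraction, in Da


def per_sugar(total, n_sugars):
    """Average contribution of one sugar unit to a chain-level total."""
    if n_sugars <= 0:
        raise ValueError("number of sugars must be positive")
    return total / n_sugars


# Assumption: the 5,500 Da fraction corresponds to a chain of about 20 sugars.
rise_per_sugar = per_sugar(PATCH_LENGTH_A, CHAIN_SUGARS)
mass_per_sugar = per_sugar(MIN_STABILIZING_MASS_DA, CHAIN_SUGARS)
print(f"implied rise per sugar: {rise_per_sugar:.1f} angstroms")
print(f"implied mass per sugar: {mass_per_sugar:.0f} Da")
```

Under these assumptions the implied averages are 5 Å and 275 Da per sugar unit.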

Interaction with Substrates and Inhibitors. The immediate vicinity of the tryptase active site is quite similar in structure to that of trypsin. The specificity S1 pocket, which opens to the west of the reactive Ser-195 (Fig. 3a), is virtually identical to that of trypsin and well suited to accommodate P1-Lys and Arg side chains. The 4-amidinophenylpyruvic acid (APPA) molecule inserts into this pocket in the same manner as in the complex with trypsin (41). Thus, its amidino group is hydrogen bonded to both Asp-189 carboxylate oxygens, Gly-219 O, and Ser-190 Oγ, and its phenyl ring is sandwiched between peptide planes 215–216 and 190–192. Ser-195 Oγ bonds to the carbonyl group of the tetrahedral pyruvate part of APPA (Fig. 3a), and hydrogen bonds to His-57 Nε. As indicated by the relatively low equilibrium dissociation constant of the APPA-tryptase complex [Ki 0.71 µM; (42)], APPA fits well to the tryptase active site. Toward the south of the active site of tryptase, the side chains of Asp-143, Asp-145, and Asp-147 protrude from the relatively flat and hydrophobic southern embankment (Fig. 3a). The resulting negative charge cluster provides a second anchoring point for dibasic synthetic tryptase inhibitors such as bis-benzamidines (17, 42, 43), allowing favorable interactions with a distal basic group such as in pentamidine. The structural basis of the unexpectedly high affinity of bifunctional inhibitors containing suitably arranged adjacent imidazole moieties such as present in the inhibitor BABIM and closely related analogues (43, 44) has recently been revealed: two nitrogen atoms of the two methylene-connected benzimidazoles coordinate a zinc ion that also binds to the active-site located Ser-195 Oγ and His-57 Nε (44). The zinc-mediated binding enhancement of BABIM-like inhibitors is particularly large but not restricted to tryptase.
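As a rough illustration of what the Ki quoted above for APPA implies, the following sketch computes the fraction of active sites occupied at a given inhibitor concentration. It assumes simple reversible 1:1 binding, free inhibitor approximately equal to total inhibitor, and no competing substrate. These assumptions, the function name, and the example concentrations are introduced here and are not taken from the original work.

```python
KI_APPA_UM = 0.71  # Ki of the APPA-tryptase complex quoted in the text, in µM


def fraction_bound(inhibitor_um, ki_um=KI_APPA_UM):
    """Fraction of active sites occupied under simple 1:1 binding.

    Assumes free inhibitor is approximately total inhibitor and that no
    substrate competes for the site.
    """
    if inhibitor_um < 0 or ki_um <= 0:
        raise ValueError("concentrations must be non-negative and Ki positive")
    return inhibitor_um / (inhibitor_um + ki_um)


# Hypothetical APPA concentrations spanning a tenfold range around Ki.
for conc in (0.071, 0.71, 7.1):
    print(f"[APPA] = {conc:g} µM -> occupancy = {fraction_bound(conc):.2f}")
```

By construction, half of the sites are occupied when the inhibitor concentration equals Ki.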

FIG. 6. Model of the binding of a 20-mer heparin-like glycosaminoglycan chain along the A–B edge of the tryptase tetramer. The solid-surface representation of tryptase indicates positive (blue) and negative (red) electrostatic potential contoured from –4 kT/e to 4 kT/e. The heparin chain (green/yellow/red stick model) is long enough to bind to clusters of positively charged residues on both sides of the monomer-monomer interface, thereby bridging and stabilizing the interface which is exclusively hydrophobic in nature (see Fig. 5a).

Toward the east, the substrate-binding site of tryptase is not only bounded by the side chains of Tyr-37B and Tyr-74 of monomer A, but also by the Phe-153 benzyl group and the 152-spur of the neighboring monomer B. Thus, binding of extended substrate chains is limited to about P5′ (Fig. 7). Toward the north, the 97-loop of monomer A borders the substrate binding region in a manner different from most other serine proteinases, and together with the side chains of Phe-94, Ala-97, and Gln-98 of monomer D forms a projecting “canopy.” The S2 subsite underneath is open and larger than that of trypsin. The S3/S4 subsite above the Trp-215 indole moiety is fully blocked by the side chain of Gln-98 and the phenolic group of Tyr-95 provided by monomer D. Toward the west, however, the substrate-binding site is bordered exclusively by segments of the D-monomer, in particular the His-57 imidazole ring and segment 57–60. Thus, the active centers of monomers A and D (B and C) are spatially close (distance ~23 Å for the A–D pair) to each other in the tryptase tetramer, rendering the tryptase tetramer suitable for the specific binding of bifunctional inhibitors with relatively short spacers.

FIG. 7. View from the LDTI inhibitor (represented only by its reactive site loop P7 to P3′) toward the active-site cleft. The P1 Lys residue is buried.

The central pore of tryptase restricts the size of accessible substrates and inhibitors considerably. For larger proteins such as fibronectin and the zymogens of stromelysin-1 and urokinase-type plasminogen activator, the cleavage sites must be extended into the active sites. Docking experiments with C-terminally truncated prostromelysin-1 (45) and with single-chain tissue plasminogen activator (29) as a model for prourokinase show that the activation cleavage loops of these proproteinases must be extracted from their crystal structures to allow binding in the tryptase active center. More flexible peptides, in contrast, could easily thread through the pore of the tetramer to be processed or destroyed. Flexible polypeptide chains with two distant basic residues, as in “vasoactive intestinal peptide” (18), might even dock to adjacent active sites simultaneously to produce fragments of distinct length.

The active centers of the tryptase monomers are also largely inaccessible for macromolecular inhibitors. The only exception known is LDTI, an “atypical” Kazal-type inhibitor that is smaller than the classical members of this family (16). LDTI has been shown to bind to trypsin through its reactive-site loop (residues P4 to P4′) in a canonical manner (17, 46). In the model of the complex with tryptase monomer A, the four N-terminal residues preceding this binding segment could bend toward the south (with respect to Figs. 3 and 7), leading to the juxtaposition of the basic Lys-I1-Lys-I2 amino terminus (with the suffix I identifying inhibitor residues) with the carboxylate groups of Asp-143 and Asp-147 of monomer A. Alternatively, the two Lys residues could interact with Asp-60B of molecule D. The involvement of such electrostatic interactions is supported by the deleterious effect of deletions and substitutions of these basic residues on the affinity of LDTI toward tryptase but not trypsin (17). The LDTI reactive-site loop, running from Cys-I14 (P5) to Pro-I22 (P4′; ovomucoid numbering), is relatively small compared with classical Kazal-type inhibitors, allowing good overall fit to the restricted substrate binding groove (Figs. 7 and 8a). Furthermore, its central helix is one turn shorter, so that it just fits into the central pore of the tetramer on canonical binding to the active site of monomer A with only a few narrow contacts of its molecular antipole, opposite to its reactive-site loop, with the 147-loop of monomer D. Docking of a second LDTI molecule is possible at the opposite active centers of either monomer B or monomer C (Fig. 8a). A slight collision between Cys-I56 and Gly-I28 of two bound LDTI molecules could be relieved by minor torsion in the proteinase-inhibitor interfaces, as observed for other canonically binding inhibitors such as eglin c (46). Any such torsion in the LDTI molecule bound to monomer A would impose an opposing torsion in the LDTI molecule bound to monomer B, facilitating such a relaxation. The simultaneous binding of two LDTI molecules to the tetramer is in good agreement with experimental results showing ~50% inhibition of the cleavage activity toward small chromogenic substrates by nanomolar LDTI concentrations (16). Modeling experiments with more elongated classical Kazal-type inhibitors or with the prototypical bovine pancreatic trypsin inhibitor indicate strong collisions of their distal pole segments with the neighboring monomers D and B, in particular with the 147-loops, explaining the observed inactivity of these inhibitors toward tryptase (Fig. 8b). The central portion of the two-domain mucous proteinase inhibitor (MPI=SLPI=HUSI-I) would clash with the A–D interface region of the tryptase tetramer if bound to the active site of monomer A (Fig. 8c) via its inhibitorily active second domain (47). Similarly, elafin (=SKALP), an inhibitor corresponding to the MPI second domain (48), should not be able to inhibit tryptase. The much larger plasma proteinase inhibitors are clearly far too bulky to fit into the narrow pore of the tryptase tetramer and gain access to one of the active centers.

FIG. 8. Models of the interaction of the human tryptase tetramer with proteinaceous inhibitors. The tryptase tetramers are shown as green ribbons. An inhibitor molecule (blue) is modeled into the active site of monomer A by superposition of the proteinase moiety of known proteinase-inhibitor complexes to a tryptase monomer. For LDTI and BPTI the target proteinase was trypsin (17, 49), for MPI chymotrypsin (47). The active sites of the other tryptase monomers are occupied by APPA molecules (orange). Parts of the inhibitors clashing with the structure of tryptase (i.e., a distance smaller than 1.5 Å between the Cα-atoms of the respective molecules) are highlighted in red. (a) In addition to one molecule of the “atypical” Kazal-type inhibitor LDTI bound to the tryptase monomer A, a second molecule (shown in pink and yellow) can bind to the active site of either monomer B or C. (b) Bovine pancreatic trypsin inhibitor (aprotinin). (c) Human mucous proteinase inhibitor bound to tryptase with its inhibitorily active second domain.
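The clash criterion stated in the legend (Cα atoms of the two molecules closer than 1.5 Å) can be sketched as a simple pairwise distance test. The code below is a minimal illustration, not the procedure actually used to prepare the figure; the function name and the coordinates are made up for the example, and real input would come from the superimposed model coordinates.

```python
import math

CLASH_CUTOFF_A = 1.5  # Cα-Cα distance cutoff from the Fig. 8 legend, in angstroms


def find_clashes(inhibitor_ca, proteinase_ca, cutoff=CLASH_CUTOFF_A):
    """Return index pairs (i, j) whose Cα atoms lie closer than `cutoff`.

    Each argument is a sequence of (x, y, z) coordinates in angstroms.
    """
    clashes = []
    for i, a in enumerate(inhibitor_ca):
        for j, b in enumerate(proteinase_ca):
            if math.dist(a, b) < cutoff:
                clashes.append((i, j))
    return clashes


# Made-up coordinates, used only to exercise the function.
inhibitor = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
tryptase = [(1.0, 0.0, 0.0), (10.0, 3.0, 0.0)]
print(find_clashes(inhibitor, tryptase))
# prints: [(0, 0)]
```

The brute-force double loop is adequate for a few hundred Cα atoms; for larger systems a spatial index would be the usual choice.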

CONCLUSION

In summary, the structure of the βII-tryptase tetramer has been identified based on the four crystallographically independent quasi-identical monomers and the analysis of their arrangement within the crystal packing. With its frame-like architecture and its active centers facing a narrow central pore, the resulting tryptase tetramer structure explains most of the distinct properties of the biologically active tryptase tetramer in solution. The unusual substrate specificity, with a preference for peptidergic substrates, and the resistance to proteinaceous inhibitors other than LDTI are both caused by the limited accessibility of the active sites within the narrow central pore. The tetramer can be stabilized by heparin glycosaminoglycan chains larger than ~20 sugar residues, a length required to bridge the weaker of the two distinct monomer-monomer interfaces. The loss of enzymatic activity on dissociation of the tetramer is caused by stabilization by internal molecular groups of a zymogen-like rather than the active state. Finally, the knowledge of the structure of the active center of the monomer as well as of the distances between neighboring active sites allows the rational design of multifunctional inhibitors. Such inhibitors that bind to more than one active center will ideally have potentiated affinity, conferring selectivity for the tryptase tetramer. Such inhibitors will be valuable as pharmacological tools to probe the pathophysiological function(s) of tryptases in vivo and may have therapeutic potential against asthma and other mast-cell related disorders.

We are grateful to R. Huber and H. Fritz for their generous support. We thank D. Grosse and R. Mentele for their excellent help in crystallization and amino acid sequence analysis. This work was supported by Sonderforschungsbereich 469 of the University of Munich, the Deutsche Forschungsgemeinschaft (STU 161, BO 1279), the Fonds der Chemischen Industrie, and programs BIO4-CT98–0418 and TMR ERBFXCT 98–0193 of the European Union.

1. Miller, J.S., Westin, E.H. & Schwartz, L.B. (1989) J. Clin. Invest. 84, 1188–1195.
2. Pallaoro, M., Fejzo, M.S., Shayesteh, L., Blount, J.L. & Caughey, G.H. (1999) J. Biol. Chem. 274, 3355–3362.
3. Schwartz, L.B., Irani, A.M., Roller, K., Castells, M.C. & Schechter, N.M. (1987) J. Immunol. 138, 2611–2615.
4. Xia, H.Z., Kepley, C.L., Sakai, K., Chelliah, J., Irani, A.M. & Schwartz, L.B. (1995) J. Immunol. 154, 5472–5480.
5. Schwartz, L.B., Sakai, K., Bradford, T.R., Ren, S., Zweiman, B., Worobec, A.S. & Metcalfe, D.D. (1995) J. Clin. Invest. 96, 2702–2710.
6. Sakai, K., Ren, S. & Schwartz, L.B. (1996) J. Clin. Invest. 97, 988–995.
7. Caughey, G.H. (1997) Am. J. Respir. Cell Mol. Biol. 16, 621–628.
8. Johnson, P.R.A., Ammit, A.J., Carlin, S.M., Armour, C.L., Caughey, G.H. & Black, J.L. (1997) Eur. Respir. J. 10, 38–43.
9. Rice, K.D., Tanaka, R.D., Katz, B.A., Numerof, R.P. & Moore, W.R. (1998) Curr. Pharm. Des. 4, 381–396.
10. De Sanctis, G.T., Merchant, M., Beier, D.R., Dredge, R.D., Grobholz, J.K., Martin, T.R., Lander, E.S. & Drazen, J.M. (1995) Nat. Genet. 11, 150–154.
11. Hunt, J.E., Stevens, R.L., Austen, K.F., Zhang, J., Xia, Z. & Ghildyal, N. (1996) J. Biol. Chem. 271, 2851–2855.
12. Schwartz, L.B. (1994) Methods Enzymol. 244, 88–100.
13. Caughey, G.H. (1995) Mast Cell Proteases in Immunology and Biology (Dekker, New York).
14. Ren, S., Sakai, K. & Schwartz, L.B. (1998) J. Immunol. 160, 4561–4569.
15. Selwood, T., McCaslin, D.R. & Schechter, N.M. (1998) Biochemistry 37, 13174–13183.
16. Sommerhoff, C.P., Söllner, C., Mentele, R., Piechottka, G.P., Auerswald, E.A. & Fritz, H. (1994) Biol. Chem. Hoppe-Seyler 375, 685–694.
17. Stubbs, M.T., Morenweiser, R., Stürzebecher, J., Bauer, M., Bode, W., Huber, R., Piechottka, G.P., Matschiner, G., Sommerhoff, C.P., Fritz, H., et al. (1997) J. Biol. Chem. 272, 19931–19937.
18. Tam, E.K. & Caughey, G.H. (1990) Am. J. Respir. Cell Mol. Biol. 3, 27–32.
19. Gruber, B.L., Marchese, M.J., Suzuki, K., Schwartz, L.B., Okada, Y., Nagase, H. & Ramamurthy, N.S. (1989) J. Clin. Invest. 84, 1657–1662.
20. Stack, M.S. & Johnson, D.A. (1994) J. Biol. Chem. 269, 9416–9419.
21. Molino, M., Barnathan, E.S., Numerof, R., Clark, J., Dreyer, M., Cumashi, A., Hoxie, J.A., Schechter, N., Woolkalis, M. & Brass, L.F. (1997) J. Biol. Chem. 272, 4043–4049.
22. Lohi, J., Harvima, I. & Keski-Oja, J. (1992) J. Cell. Biochem. 50, 337–349.
23. Little, S.S. & Johnson, D.A. (1995) Biochem. J. 307, 341–346.
24. Schwartz, L.B., Bradford, T.R., Littman, B.H. & Wintroub, B.U. (1985) J. Immunol. 135, 2762–2767.
25. Pereira, P.J., Bergner, A., Macedo-Ribeiro, S., Huber, R., Matschiner, G., Fritz, H., Sommerhoff, C.P. & Bode, W. (1998) Nature (London) 392, 306–311.
26. Blevins, R.A. & Tulinsky, A. (1985) J. Biol. Chem. 260, 4264–4275.
27. Bode, W. & Schwager, P. (1975) J. Mol. Biol. 98, 693–717.
28. Wang, D., Bode, W. & Huber, R. (1985) J. Mol. Biol. 185, 595–624.
29. Renatus, M., Engh, R.A., Stubbs, M.T., Huber, R., Fischer, S., Kohnert, U. & Bode, W. (1997) EMBO J. 16, 4797–4805.
30. Huber, R. & Bode, W. (1978) Acc. Chem. Res. 11, 114–122.
31. Bode, W. (1979) J. Mol. Biol. 127, 357–374.
32. Freer, S.T., Kraut, J., Robertus, J.D., Wright, H.A.T. & Xuong, N.H. (1970) Biochemistry 9, 1997–2009.
33. Bode, W., Fehlhammer, H. & Huber, R. (1976) J. Mol. Biol. 106, 325–335.
34. Hedstrom, L., Lin, T.Y. & Fast, W. (1996) Biochemistry 35, 4515–4523.
35. Bode, W., Schwager, P. & Huber, R. (1978) J. Mol. Biol. 118, 99–112.
36. Bolognesi, M., Gatti, G., Menagatti, E., Guarneri, M., Marquart, M., Papamokos, E. & Huber, R. (1982) J. Mol. Biol. 162, 839–868.
37. Higashi, S. & Iwanaga, S. (1998) Int. J. Hematol. 67, 229–241.
38. Pignol, D., Gaboriaud, C., Michon, T., Kerfelec, B., Chapus, C. & Fontecilla Camps, J.C. (1994) EMBO J. 13, 1763–1771.
39. Banner, D.W., D’Arcy, A., Chene, C., Winkler, F.W., Guha, A., Konigsberg, W.H., Nemerson, Y. & Kirchhofer, D. (1996) Nature (London) 380, 41–46.
40. Alter, S.C., Metcalfe, D.D., Bradford, T.R. & Schwartz, L.B. (1987) Biochem. J. 248, 821–827.
41. Walter, J. & Bode, W. (1983) Hoppe-Seylers Z. Physiol. Chem. 364, 949–959.
42. Stürzebecher, J., Prasa, D. & Sommerhoff, C.P. (1992) Biol. Chem. Hoppe-Seyler 373, 1025–1030.
43. Caughey, G.H., Raymond, W.W., Bacci, E., Lombardy, R.J. & Tidwell, R.R. (1993) J. Pharmacol. Exp. Ther. 264, 676–682.
44. Katz, B.A., Clark, J.M., Finer Moore, J.S., Jenkins, T.E., Johnson, C.R., Ross, M.J., Luong, C., Moore, W.R. & Stroud, R.M. (1998) Nature (London) 391, 608–612.
45. Becker, J.W., Marcy, A.I., Rokosz, L.L., Axel, M.G., Burbaum, J.J., Fitzgerald, P.M., Cameron, P.M., Esser, C.K., Hagmann, W.K., Hermes, J.D., et al. (1995) Protein Sci. 4, 1966–1976.
46. Bode, W. & Huber, R. (1992) Eur. J. Biochem. 204, 433–451.
47. Grütter, M.G., Fendrich, G., Huber, R. & Bode, W. (1988) EMBO J. 7, 345–351.
48. Tsunemi, M., Matsuura, Y., Sakakibara, S. & Katsube, Y. (1996) Biochemistry 35, 11570–11576.
49. Huber, R., Kukla, D., Bode, W., Schwager, P., Bartels, K., Deisenhofer, J. & Steigemann, W. (1974) J. Mol. Biol. 89, 73–101.

Sonic hedgehog protein signals not as a hydrolytic enzyme but as an apparent ligand for Patched

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

NAOYUKI FUSE*†, TAPAN MAITI*†, BAOLIN WANG*, JEFFERY A. PORTER*‡, TRACI M. TANAKA HALL§¶, DANIEL J. LEAHY§, AND PHILIP A. BEACHY*||

* Department of Molecular Biology and Genetics and §Department of Biophysics and Biophysical Chemistry, Howard Hughes Medical Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21205

ABSTRACT The amino-terminal signaling domain of the Sonic hedgehog secreted protein (Shh-N), which derives from the Shh precursor through an autoprocessing reaction mediated by the carboxyl-terminal domain, executes multiple functions in embryonic tissue patterning, including induction of ventral and suppression of dorsal cell types in the developing neural tube. An apparent catalytic site within Shh-N is suggested by structural homology to a bacterial carboxypeptidase. We demonstrate here that alteration of residues presumed to be critical for a hydrolytic activity does not cause a loss of inductive activity, thus ruling out catalysis by Shh-N as a requirement for signaling. We favor the alternative, that Shh-N functions primarily as a ligand for the putative receptor Patched (Ptc). This possibility is supported by new evidence for direct binding of Shh-N to Ptc and by a strong correlation between the affinity of Ptc-binding and the signaling potency of Shh-N protein variants carrying alterations of conserved residues in a particular region of the protein surface. These results together suggest that direct Shh-N binding to Ptc is a critical event in transduction of the Shh-N signal.

Hedgehog (Hh) proteins constitute a family of secreted signaling molecules that govern patterns of cellular differentiation during embryogenesis (reviewed in refs. 1–3). The hedgehog (hh) gene was first identified and isolated in Drosophila, where its multiple roles include patterning of larval segments and adult appendages. Vertebrate hh homologues also are involved in many aspects of developmental patterning. The Sonic hedgehog (Shh) member of this family, for example, is required for patterning of the neural tube and other tissues (4).

Hedgehog protein biogenesis (reviewed in ref. 5) has been best studied for the Drosophila protein but very likely is similar for Hedgehog proteins from all species. After cleavage of an amino-terminal signal sequence on entry into the secretory pathway, the Hh protein undergoes an intramolecular autoprocessing reaction that involves internal cleavage between the Gly-Cys residues of an absolutely conserved GCF tripeptide (6, 7). The amino-terminal product of this cleavage, which is the species active in signaling (7), also receives a covalent cholesteryl adduct (8). Autoprocessing at this site and covalent linkage to cholesterol have been experimentally confirmed for the Shh protein (7–9). In Drosophila, a Hedgehog protein from a construct truncated at the internal site of cleavage is active in signaling, but this protein is not spatially restricted in its signaling activity and therefore causes gross mispatterning and lethality in embryos (10). The autoprocessing reaction thus is required not only to release the active signal from the precursor but also to specify the appropriate spatial distribution of this signal within developing tissues, presumably through insertion of the cholesteryl moiety into the lipid bilayer of the plasma membrane. Recent studies also have revealed palmitoylation of the amino-terminal cysteine of the amino-terminal signaling domain of the Shh secreted protein (Shh-N); the occurrence of this second lipid modification is regulated by autoprocessing and may also influence the activity and distribution of Shh-N (9).

The patterning of the ventral neural tube is thought to require an inductive signal from the underlying mesodermal cells of the notochord (11). Shh protein is synthesized in the notochord and can induce differentiation of ventral cell types such as floor plate cells and motor neurons from neural plate explants in vitro (12); a similar role for Shh in vivo is confirmed by a loss of these cell types in mice lacking Shh gene function (4). Shh protein thus appears to constitute the inductive patterning signal from the notochord, and in vitro explant experiments have demonstrated a concentration-dependent response, with low concentrations of Shh-N protein inducing motor neuron differentiation and higher concentrations inducing increasing numbers of floor plate cells, ultimately to the exclusion of motor neurons (12, 13). Shh-N protein at concentrations below those required to induce differentiation of motor neurons or floor plate cells can repress expression of cell markers of dorsal neural tube, such as Pax-7 and Pax-3, in neural plate explants (14, 15). This repression of dorsal cell markers is presumed to mediate the transition of naive neural plate cells into ventral progenitor cells, which then differentiate into motor neurons or ventral interneurons at later stages of embryogenesis. Thus, the concentration-dependent activity of Shh-N has been proposed to regulate the dorso-ventral patterning of the developing neural tube.

Several components have been identified as candidates for receptor function in transduction of the Hh protein signal (reviewed in ref. 3). The patched (ptc) gene, originally identified in Drosophila, encodes a multipass transmembrane protein (Ptc). ptc mutations in Drosophila embryos cause inappropriate activation of wingless gene expression, a phenotype opposite that of hh mutations, thus suggesting that ptc functions as a negative effector in hh signaling (16, 17). The observations that hh ptc double mutant embryos resemble ptc mutants and

†N.F. and T.M. contributed equally to this work.
‡Present address: Ontogeny, Inc., Cambridge, MA 02138.
¶Present address: National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709.
||To whom reprint requests should be addressed at: Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 North Wolfe Street PCTB-714, Baltimore, MD 21205. E-mail: [email protected]
PNAS is available online at www.pnas.org.
Abbreviations: Shh-N, the amino-terminal signaling domain of the Sonic hedgehog secreted protein; HNF-3β, hepatocyte nuclear factor 3β; Ptc-CTD, Ptc with a truncation resulting in a 140-residue carboxyl-terminal deletion.

SONIC HEDGEHOG PROTEIN SIGNALS NOT AS A HYDROLYTIC ENZYME BUT AS AN APPARENT LIGAND FOR PATCHED 10992


that, in a ptc mutant background, ectopic Hh expression produces no further phenotypic effects, together suggest that the Ptc gene product acts downstream of Hh to regulate its signaling activity (16, 18, 19). Genetic epistasis studies further suggest that the smoothened gene (smo), which encodes another transmembrane protein (Smo), functions downstream of ptc in the hh signaling cascade (reviewed in ref. 3). Because smo is required for hh signaling, it has been proposed that Smo activates the Hh pathway and that Ptc inhibits Smo activity. Genetic mosaic analysis in the Drosophila wing imaginal disc showed that Ptc has, in addition to a cell-autonomous negative effect on Hh signaling, an ability to sequester the Hh protein and prevent its movement to adjacent cells (20).

Vertebrate homologues of both ptc and smo genes have been identified (reviewed in ref. 3). Shh-N was found to bind to cells expressing Ptc or both Ptc and Smo, but not to cells expressing Smo alone (21, 22). Moreover, Ptc interacted with Smo independently of the presence of Shh-N, suggesting that the two transmembrane proteins form a complex. An integrated view of Drosophila genetic analyses and biochemical studies of vertebrate homologues suggests a model in which the Ptc-Smo complex might function as Hh receptor, with direct binding of Hh to Ptc releasing Smo activity from inhibition by Ptc. It must be noted, however, that these biochemical studies did not examine the role of a physical interaction between Shh-N and Ptc in activation of the Shh pathway. In addition, these biochemical studies did not exclude the possibility that Shh-N interacts not directly with Ptc but with another component of a complex that includes Ptc, because the crosslinked binding complexes were extremely large and were not analyzed with regard to their composition.

The model just described assumes a role for Shh-N as a ligand for a receptor. The crystal structure of the Shh-N protein, however, suggested an alternative possibility. This structure revealed a zinc ion coordinated in an arrangement remarkably similar to that of thermolysin, carboxypeptidase A, and other zinc hydrolases (23). Even more striking is the similarity in folded structure of a portion of Shh-N to the catalytic domain of D,D-carboxypeptidase from Streptomyces albus, a cell wall enzyme closely related in structure and activity to other bacterial enzymes involved in conferring vancomycin resistance (Fig. 1 B and D) (24, 25). Although the functional role of this putative hydrolase in Shh-N is not known, one possibility is that signaling requires Shh-N hydrolytic activity on as yet unknown substrates. Thus, several fundamental questions about the mechanisms of Shh-N signaling remain unanswered. Does Shh-N function as a ligand or as an enzyme? Is Ptc interaction with Shh-N direct and is this a critical event in transduction of the Shh-N signal? To illuminate these issues, we used the structure as the basis for design of mutations expected to abolish zinc hydrolase activity within Shh-N. We also used structure-based systematic mutagenesis to produce Shh-N proteins with alterations in evolutionarily conserved surface residues and then compared the signaling activity of these altered proteins in neural plate explants to their capacity for binding to Ptc-expressing cultured cells. We found that Shh-N signaling does not require catalytic activity and instead correlates closely with the ability of Shh-N to bind directly to Ptc.

MATERIALS AND METHODS

Preparation of Recombinant Shh-N Mutant Proteins. Constructs for altered Shh-N were made by standard methods (26). Recombinant proteins were expressed in Escherichia coli and purified as described previously (12). To prepare the 32P-labeled Shh-N protein, a protein kinase A site tag (RRASV) was introduced at the carboxy terminus of Shh-N, and the tagged Shh-N was phosphorylated in a reaction containing [γ-32P]ATP. Cy2-labeled recombinant Shh-N was prepared by using CyDye FluoroLink Reactive Dye (Amersham).

FIG. 1. A possible catalytic site in Shh-N. (A) Model for an apparent zinc hydrolase catalytic site derived from the crystal structure of Shh-N (23). Glu-177 and His-135 residues are presumed to be essential for catalysis (see text), and His-141, Asp-148, and His-183 coordinate the Zn2+ ion. (B) Superimposed alpha-carbon traces of Shh-N (yellow) and D,D-carboxypeptidase from Streptomyces albus (green). The portion of these proteins displaying structural homology is drawn, with the Zn2+ ions shown as blue spheres. Residues within the structurally homologous portion of Shh-N that are altered in SC (four of six) and SD (two of three) (see text) are located in structurally diverged loops and are highlighted in red. (C) Coomassie blue staining of purified recombinant wild type (WT) and E177A (EA), H135A (HA) and double mutant (EH) Shh-N proteins resolved in SDS/PAGE (15%); molecular mass markers are indicated at left (kDa). (D) Structure-based alignment of amino acid sequence from the portions of mouse Shh (mSHH) and Streptomyces albus D,D-carboxypeptidase (DD-C) shown in B. The residues involved in zinc coordination or hydrogen bonding of the water molecule are shown in dark blue, and other conserved residues are in light blue. Target sites for mutagenesis are indicated in green (for zinc hydrolase mutants) or red (for SC and SD mutants, see below).

Chicken Neural Plate Explant Culture. Chicken intermediate neural plate explant culture methods have been described previously (12, 27). Neural plate explants were stained with either mouse anti-Pax-7 [PAX7, Developmental Studies Hybridoma Bank (DSHB)], rabbit anti-hepatocyte nuclear factor (HNF)-3β (K2, a gift from T. M. Jessell, Columbia University), or mouse anti-Islet-1 (40.2D6, DSHB) antibodies.

Cell Culture for Ptc Expression. Fragments encoding full-length mouse Ptc and carboxyl-terminal Myc-tagged Ptc-CTD (Ptc with a truncation resulting in a 140-residue carboxyl-terminal deletion; amino acids 1–1,291, a gift from M. P. Scott, Stanford University) were inserted into pIND(Sp) vector (Invitrogen). To make stable cell lines, EcR-293 cells (Invitrogen) were transfected with recombinant constructs or empty vector, and several independent clones for each construct were isolated.

Shh-N-Ptc-Binding Assay. Ptc expression was induced in cloned stable derivatives of the cell line EcR-293 by addition of ponasterone A (Invitrogen). After induction, 2.5 × 10^5 cells were mixed with increasing concentrations (0.1–50 nM) of 32P-Shh-N (for Scatchard analyses) or with a fixed concentration (0.9 nM) of 32P-Shh-N and various concentrations of competitors (for competitive binding assays). After incubation at 4°C, cells were collected, and the bound 32P-Shh-N was determined. For the qualitative Ptc-binding assay, QT6 cells transiently transfected with pRK5-Ptc-CTD were incubated with 2 nM Cy2-labeled Shh-N protein and 160 nM unlabeled competitor. The ability of the unlabeled protein to compete for binding of Cy2-Shh-N protein to cells was directly observed by fluorescence microscopy.
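The Scatchard analysis named in this assay can be illustrated numerically. The Python sketch below is not the authors' procedure: it assumes a single class of independent sites, ignores the nonspecific (low-affinity) component, and uses synthetic data generated from the hypothetical constants `KD_TRUE` and `BMAX_TRUE`. It fits bound/free against bound by least squares and reads the dissociation constant from the slope (−1/Kd) and the number of sites from the x-intercept.

```python
# Scatchard sketch: for one class of sites, bound/free = (Bmax - bound) / Kd.
# All numbers are synthetic and illustrative, not data from the paper.

KD_TRUE = 0.5      # nM, assumed dissociation constant (hypothetical)
BMAX_TRUE = 100.0  # arbitrary units, assumed total sites (hypothetical)


def one_site_bound(free_nm, kd, bmax):
    """Equilibrium binding to a single class of independent sites."""
    return bmax * free_nm / (kd + free_nm)


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scatchard(free_concs, bound_values):
    """Return (Kd, Bmax) from a plot of bound/free versus bound."""
    ratios = [b / f for b, f in zip(bound_values, free_concs)]
    slope, intercept = linear_fit(bound_values, ratios)
    kd = -1.0 / slope        # slope of the Scatchard line is -1/Kd
    bmax = intercept * kd    # x-intercept of the fitted line
    return kd, bmax


# Free ligand concentrations spanning the 0.1-50 nM range used in the assay.
free = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
bound = [one_site_bound(f, KD_TRUE, BMAX_TRUE) for f in free]
kd_fit, bmax_fit = scatchard(free, bound)
print(f"Kd = {kd_fit:.2f} nM, Bmax = {bmax_fit:.1f}")
```

With real measurements the free concentration would be obtained as total minus bound ligand, and the low-affinity component would have to be subtracted before fitting.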

Crosslinking of 32P-Shh-N to Ptc. Induced EcR-293 cells were incubated with the 32P-labeled Shh-N at a final concentration of 2 nM at 4°C. Unlabeled Shh-N was added as competitor to 200 nM. After the cells were washed once with PBS, labeled Shh-N was crosslinked to cells by adding freshly prepared disuccinimidyl suberate (Pierce) to 5 mM in PBS and incubating for 50 min at 4°C. Crosslinked cells were washed with cold PBS and lysed in 0.15 M NaCl/0.05 M Tris-HCl, pH 7.2/1% Triton X-100/1% sodium deoxycholate/0.1% SDS (RIPA) buffer containing proteinase inhibitors. Lysate proteins were fractionated by SDS/PAGE (6%) and visualized by staining with Coomassie blue. After the gel was dried, crosslinked products were visualized by autoradiography.

RESULTS

Zinc Hydrolase Activity Is Not Required for Shh-N Signaling. To determine whether Shh-N acts as an enzyme, glutamate-177 (E177) and histidine-135 (H135) were substituted by alanine. E177 forms a hydrogen bond to a zinc-bound water molecule, and H135 is positioned to stabilize a potential tetrahedral intermediate (Fig. 1A; ref. 23). By analogy with other zinc hydrolases, both residues are likely to be essential for catalytic activity (28). Furthermore, the VanX protein, a structural homologue of Shh-N, displays a reduction in activity

FIG. 2. Signaling activities of Shh-N zinc hydrolase mutants. (A–C) Chicken intermediate neural plate explants double stained for expression of the motor neuron marker Islet-1 (blue) and the floor plate marker HNF-3β (red). No Islet-1- or HNF-3β-positive cells were observed in control explants (A), whereas 5 nM (B) and 25 nM (C) concentrations of wild-type Shh-N induced expression of Islet-1 and HNF-3β, respectively. (D–L) Neural plate explants double stained with antibodies against a dorsal marker Pax-7 (green) and the floor plate marker HNF-3β (red). Explants cultured with medium only (D) express Pax-7 but not HNF-3β. Wild-type Shh-N protein fully repressed expression of Pax-7 at 4 nM (E) and uniformly induced HNF-3β in all cells at 20 nM (F). The EH and EA mutant proteins repressed Pax-7 at 4 nM (G and I, respectively), albeit somewhat less efficiently, and were able to uniformly induce HNF-3β expression at 20 nM (H and J, respectively). The H135A (HA) mutant protein was indistinguishable from wild type (K, at 4 nM and L, at 20 nM). Images were captured using a ×40 objective.

Table 1. Properties of altered Shh-N proteins

Protein  Mutation sites                                  Pax-7 repression, nM  HNF-3β induction, nM  Ptc-CTD affinity, nM  5E1 IP  Heparin binding
WT       Wild type (aa 25–198)                           ~4                    ≤20                   0.48                  ++      +
HA       H135A                                           ~4                    ≥20                   0.63                  ND      +
EA       E177A                                           ~10                   ≥20                   1.7                   ND      +
EH       E177A, H135A                                    ~10                   ≥20                   1.7                   ND      +
SA       K75A, E76A, Y81A, D105A, N116A, E189A, K195A    ~4                    ≤20                   0.66                  ++      +
SB       N51A, V52A, T56A, E168A                         ~4                    ≥20                   0.48                  ++      +
SC       P42A, K46A, R154A, S157A, S178A, K179A          >1,000                >1,000                >36                   –       –
SD       E90A, D132A, E138A                              ~10                   ≥20                   0.84                  ++      +
SE       P42A, K46A                                      ~20                   ~100                  2.4                   –       +
SF       R154A, S157A                                    ~70                   ≥100                  9.1                   +       +
SG       S178A, K179A                                    ~30                   ~100                  4.3                   +       +

Protein signaling was tested at initial concentrations of 4, 20, 100, 500, and 1,000 nM and subsequently at 10 nM concentration intervals for EA, HA, EH, SD, SE, SF, and SG. The minimum concentration required for complete repression of Pax-7 and for uniform induction of HNF-3β is shown for each protein. As an indication of affinity for Ptc-CTD, binding coefficients (KI) for binding of mutant Shh-N proteins to Ptc-CTD were derived from competitive binding experiments in Fig. 6 A and B by using the equation KI = [IC50]/(1 + [L]/KL), where [IC50] is the concentration of unlabeled mutant proteins required for 50% competition, [L] is the concentration of unbound wild-type protein (32P-Shh-N), and KL is the dissociation constant for wild-type Shh-N. Immunoprecipitation by 5E1 monoclonal antibody (Fig. 7) and binding to heparin-agarose are indicated. ND, not determined.
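The equation in this legend, KI = [IC50]/(1 + [L]/KL), can be evaluated directly. The Python sketch below is only an illustration under stated assumptions: the function name `inhibition_constant` is hypothetical, the IC50 value is invented for the example, and the total labeled-ligand concentration from Materials and Methods (0.9 nM) stands in for the unbound concentration [L], which the legend defines but does not report.

```python
def inhibition_constant(ic50_nm, ligand_nm, kl_nm):
    """Return K_I = IC50 / (1 + [L]/K_L); all concentrations in nM."""
    if kl_nm <= 0:
        raise ValueError("K_L must be positive")
    if ligand_nm < 0 or ic50_nm < 0:
        raise ValueError("concentrations must be non-negative")
    return ic50_nm / (1.0 + ligand_nm / kl_nm)


# Assumption: total 32P-Shh-N (0.9 nM) used as a stand-in for unbound [L].
LIGAND_NM = 0.9
# Dissociation constant of wild-type Shh-N for Ptc-CTD quoted in the text.
KL_NM = 0.48
# Hypothetical IC50 for an altered protein; not a value from the paper.
EXAMPLE_IC50_NM = 5.0

ki = inhibition_constant(EXAMPLE_IC50_NM, LIGAND_NM, KL_NM)
print(f"K_I = {ki:.2f} nM for IC50 = {EXAMPLE_IC50_NM} nM")
```

The correction matters most when [L] is comparable to or larger than KL, as it is here; when [L] is much smaller than KL, KI approaches the measured IC50.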


of more than six orders of magnitude on alteration of the R71 residue (29), which corresponds to H135 in Shh-N (25). These substitutions (E177A, H135A) were introduced individually and in combination into an E. coli expression vector, and purified altered proteins were prepared (Fig. 1C).

We used a chicken intermediate neural plate explant culture system to test the signaling activity of recombinant proteins (12). Wild-type Shh-N protein applied to these explants induced motor neurons at 5 nM and predominantly floor plate cells at 25 nM, as monitored by expression of Islet-1 and HNF-3β, respectively (Fig. 2 A–C). Shh-N protein at 4 nM sufficed for suppression of the dorsal marker Pax-7 (Fig. 2 D and E). The concentrations of Shh-N required for these inductive events, although slightly higher than previously reported (12, 15, 27), were reproducible in the assay protocol used in this study.

All three of the zinc hydrolase mutant Shh-N proteins tested, E177A (EA), H135A (HA), and the double mutant (EH), retained the capacity to repress Pax-7 expression and to induce floor plate cells in the explants (Fig. 2 G–L). Whereas the EA and EH mutant proteins displayed slightly reduced signaling activity, the HA protein was indistinguishable from wild type (Fig. 2 G–L; Table 1). Because the altered residues are absolutely critical for catalytic activity in other zinc hydrolases (28, 29), retention of signaling activity by Shh-N hydrolase mutant proteins indicates that catalytic activity is not required for signaling. The reduced potency for EH and EA in signaling may reflect a destabilization of folded protein structure, as might be expected from substitution of Ala for the largely buried side chains of the Glu-177 and His-135 residues. Indeed, the EA and EH altered proteins displayed a somewhat reduced affinity for Ptc-CTD protein, which may account for their reduced potency, whereas HA was essentially indistinguishable from wild type (Table 1; see below).

FIG. 3. Direct binding of Shh-N to Ptc. (A) Ptc and Ptc-CTD expression in stably transfected cloned cell lines. Cell lysates were prepared from stable EcR-293 cell lines carrying pIND(Sp) (empty vector control), pIND(Sp)-Ptc, or pIND(Sp)-Ptc-CTD, and proteins were fractionated by SDS/PAGE (6%) followed by blotting and detection with anti-Ptc antibody (Santa Cruz Biotechnology). Two bands (dots) were detected in lysates from Ptc or from Ptc-CTD cells, but not from control cells. (B) Crosslinking of 32P-labeled Shh-N protein to Ptc and Ptc-CTD. EcR-293 cells expressing Ptc and Ptc-CTD were incubated with 32P-Shh-N protein in absence (–) or presence (+) of a 100-fold excess of unlabeled Shh-N protein and then crosslinked. Cell lysates were subjected to SDS/PAGE (6%) and crosslinked products detected by autoradiography. Autoradiographic images for control and Ptc and for Ptc-CTD are presented at distinct contrast settings to highlight the crosslinked species. Migration of marker proteins (in kDa) is shown at left. (C, D) Scatchard analysis of the high-affinity component of 32P-Shh-N binding to EcR-293 cells expressing Ptc (C) or Ptc-CTD (D). (E) Summary of predicted molecular masses of Ptc and Ptc-CTD, experimental values estimated from Western blotting (A), and apparent masses of crosslinked products (B). Experimental values are the average of several independent determinations. Also shown are estimates of the binding coefficients of Shh-N for Ptc and for Ptc-CTD, and estimates of the number of binding sites per cell.

Direct Binding of Shh-N Protein to Ptc. Because the analyses above suggested a noncatalytic function of Shh-N protein, we next focused on Shh-N interaction with Ptc (21, 22). To determine whether Shh-N protein directly interacts with Ptc, we generated stable cloned EcR-293 cell lines for ecdysone-inducible expression of full length Ptc and Ptc-CTD (see Materials and Methods). Such stable cell lines, but not a control line carrying the empty vector, expressed Ptc and Ptc-CTD proteins when induced with the ecdysone analog, ponasterone A (Fig. 3A). On protein blots probed with anti-Ptc antibody, two broad bands were detected for Ptc (dots, 168 kDa and 157 kDa) or for Ptc-CTD (dots, 163 kDa and 141 kDa). The estimated masses of the faster-migrating species were close to the molecular masses predicted from primary sequence (159 kDa for Ptc and 144 kDa for Ptc-CTD) (Fig. 3E).

For sensitive detection of Shh-N binding to Ptc, a 32P-labeled Shh-N protein was prepared by introducing a protein kinase A (PKA) site at the carboxy terminus of Shh-N followed by labeling of the purified recombinant protein with PKA and [γ-32P]ATP (see Materials and Methods). Addition of this kinase site at the carboxy terminus did not affect signaling activity of Shh-N (data not shown). We performed crosslinking of 32P-labeled Shh-N protein to EcR-293 cells expressing Ptc or Ptc-CTD in the presence of a bivalent crosslinker, disuccinimidyl suberate. As shown in Fig. 3B, crosslinked products were detected in lysates of Ptc and Ptc-CTD cells, but not in those of control cells. These crosslinked species were abolished by competition with unlabeled Shh-N protein (+ lanes), demonstrating a specific interaction. The crosslinked species form a single band, not two as detected in Western blotting, suggesting that a particular form of Ptc or Ptc-CTD might bind to Shh-N. The estimated molecular masses of the crosslinked products (172 kDa for Ptc and 158 kDa for Ptc-CTD) differ by 14 kDa, which corresponds closely to the difference in mass between Ptc and Ptc-CTD and indicates the participation of Ptc and Ptc-CTD in these complexes. The apparent masses of these complexes furthermore are close to the sums of the masses of Shh-N plus Ptc or of Shh-N plus Ptc-CTD (178 kDa and 163 kDa, respectively) (Fig. 3E), suggesting a 1:1 stoichiometry of Ptc and Shh-N in these complexes. These results strongly suggest that Shh-N interacts directly with Ptc protein.

Quantitative analysis of 32P-Shh-N binding to these cells revealed a high-affinity Ptc-dependent component of binding that could be competed by nanomolar concentrations of unlabeled Shh-N and a low-affinity component that was not dependent on Ptc expression and that could not be competed by Shh-N. Scatchard analysis of the Ptc- and Shh-N-specific high-affinity component (Fig. 3 C and D) indicated that the binding coefficients of Ptc and Ptc-CTD for 32P-Shh-N protein are similar (0.58 nM and 0.48 nM, respectively; Fig. 3E) (21). Assuming, as argued above, that one Shh-N ligand binds to one Ptc molecule, the number of binding sites per cell for Ptc-CTD (210,000) is about 5.5 times higher than that for Ptc (38,000) (Fig. 3E). The temperature utilized in these binding studies (4°C) is not permissive of endocytosis, indicating that Shh-N binding initially occurs on the cell surface, even though immunofluorescence studies clearly demonstrate that Ptc and Ptc-CTD proteins are predominantly localized inside cells (data not shown). The difference in number of binding sites for these two proteins thus could be caused either by a higher degree of surface localization for Ptc-CTD or, alternatively, by a higher level of Ptc-CTD expression as compared with Ptc (Fig. 3A), a phenomenon also consistently observed in transiently transfected cells (data not shown). Thus we cannot at present distinguish whether the 140 residues absent from Ptc-CTD influence the subcellular localization of the Ptc protein or its steady-state levels within the cell.
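The arithmetic behind the crosslinking and binding-site comparisons in this section can be rechecked in a few lines. The Python sketch below uses only the masses and site counts quoted in the text; the Shh-N mass is not stated explicitly and is inferred here from the quoted sums, which is an assumption of this example rather than a reported measurement.

```python
# Values quoted in the text (kDa and sites per cell).
PREDICTED_KDA = {"Ptc": 159, "Ptc-CTD": 144}       # from primary sequence
CROSSLINKED_KDA = {"Ptc": 172, "Ptc-CTD": 158}     # observed complexes
EXPECTED_1TO1_KDA = {"Ptc": 178, "Ptc-CTD": 163}   # quoted sums with Shh-N
SITES_PER_CELL = {"Ptc": 38_000, "Ptc-CTD": 210_000}

# Shh-N mass implied by the quoted sums (an inference, not a stated value).
implied_shh_n_kda = {
    name: EXPECTED_1TO1_KDA[name] - PREDICTED_KDA[name] for name in PREDICTED_KDA
}

# Mass difference between the two crosslinked complexes versus the
# difference predicted for the receptors alone.
crosslink_diff = CROSSLINKED_KDA["Ptc"] - CROSSLINKED_KDA["Ptc-CTD"]
predicted_diff = PREDICTED_KDA["Ptc"] - PREDICTED_KDA["Ptc-CTD"]

# Relative deviation of each observed complex from the 1:1 expectation.
deviation = {
    name: (CROSSLINKED_KDA[name] - EXPECTED_1TO1_KDA[name]) / EXPECTED_1TO1_KDA[name]
    for name in CROSSLINKED_KDA
}

# Ratio of binding sites per cell, Ptc-CTD relative to Ptc.
site_ratio = SITES_PER_CELL["Ptc-CTD"] / SITES_PER_CELL["Ptc"]

print("implied Shh-N mass (kDa):", implied_shh_n_kda)
print("complex difference:", crosslink_diff, "kDa; predicted:", predicted_diff, "kDa")
print("deviation from 1:1 sum:", {k: f"{v:.1%}" for k, v in deviation.items()})
print(f"site ratio: {site_ratio:.1f}")
```

Both complexes fall within a few percent of the mass expected for one Shh-N per receptor, which is the basis for the 1:1 stoichiometry argued in the text.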

FIG. 4. Alteration of Shh-N surface residues. (A) Ribbon diagram and (B, C) surface representations of Shh-N. B is shown in the same orientation as A, but C is rotated 180° about a vertical axis relative to A and B. Surface-exposed evolutionarily conserved residues that were selected for alteration cluster into four major regions: SA (blue), SB (green), SC (red), and SD (yellow). Residues mutagenized are indicated in Table 1. (D) Coomassie blue stain of an SDS/PAGE separation of purified mutant Shh-N proteins. SE, SF, and SG denote proteins with distinct subsets of the altered residues in SC (see Table 1). Migration of molecular mass markers is indicated at left (in kDa). Figs. 1B and 4A were made with MOLSCRIPT (36); Fig. 4 B and C were made with GRASP (37).

The Role of Shh-N Surface Residues in Signaling and in Ptc Binding. Having demonstrated a direct and high-affinity interaction between Ptc and Shh-N, we set out to determine the significance of this interaction by examining the correlation between Ptc binding and signaling potency of altered Shh-N proteins. The Shh-N protein was subjected to systematic mutagenesis to identify surface residues involved in signaling and potential ligand/receptor interactions. Because Hh proteins can act similarly across species and in distinct biological settings [Shh, for example, is active in Drosophila (30, 31) and distinct vertebrate proteins can act in common pathways (32)], it seems likely that surface residues potentially important in inductive activities and ligand/receptor interactions would be conserved. The Shh-N structure was used to identify surface residues based on degree of side chain exposure to solvent (23). Among these surface residues, those that are evolutionarily conserved were geographically divided into four major regions named SA, SB, SC, and SD (Fig. 4 A–C) and subjected to mutagenesis. We initially generated four mutant proteins, each containing multiple alanine substitutions at the conserved surface residues within each region (see Table 1). Because the side chains of the residues selected are solvent exposed, we expected that the folded structures of these proteins would not be affected.

FIG. 5. Signaling activities of Shh-N proteins with altered surface residues. Neural plate explants stained for expression of Pax-7 (green) and HNF-3β (red). Explants were cultured in the presence of the indicated proteins at the indicated concentrations (nM). SA and SB proteins are as active as wild-type Shh-N, because they repress expression of Pax-7 at 4 nM and induce expression of HNF-3β at 20 nM. The SD protein is slightly less active than wild-type protein, and the SC mutant protein is completely inactive. The SE and SG proteins display reduced activity, and the SF protein is even less active. Results are summarized in Table 1. Images were captured using a ×40 objective.

The altered proteins were purified (Fig. 4D) and applied to chicken neural plate explant cultures. The SA and SB altered proteins repressed Pax-7 expression and induced floor plate cells in the explants as well as the wild-type protein (Fig. 5 A–D), and the SD altered protein displayed an approximate 2.5-fold reduction in activity (Fig. 5 F–H). In striking contrast, no signaling activity of the SC mutant could be detected even at 1 µM, a concentration 250-fold higher than that required for Pax-7 repression by wild-type protein (Fig. 5E; results summarized in Table 1). We next examined Ptc binding for these altered proteins using a competition binding assay. The SA, SB, and SD mutant proteins competed with the 32P-labeled wild-type Shh-N protein for binding to Ptc-CTD expressing cells as well or nearly as well as the wild-type protein (Fig. 6A), yielding similar binding coefficients (Table 1). Ptc-binding activity of the SC mutant, however, was not detectable (Fig. 6A; Table 1), suggesting a possible correlation between Ptc binding and signaling activity for the Shh-N protein.
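The fold reductions quoted in this paragraph follow directly from the Pax-7 repression concentrations in Table 1. The short Python sketch below reproduces that arithmetic; the Table 1 entries are approximate values, and for SC only a lower bound can be computed because no activity was detected at the highest concentration tested.

```python
# Approximate minimum concentrations (nM) for complete Pax-7 repression,
# taken from Table 1.
WILD_TYPE_PAX7_NM = 4
PAX7_REPRESSION_NM = {"SA": 4, "SB": 4, "SD": 10}

# SC showed no activity at the highest concentration tested (1 uM).
SC_HIGHEST_TESTED_NM = 1000

# Fold reduction in potency relative to wild type (1.0 means unchanged).
fold_reduction = {
    name: conc / WILD_TYPE_PAX7_NM for name, conc in PAX7_REPRESSION_NM.items()
}

# For SC only a lower bound on the fold reduction is available.
sc_fold_lower_bound = SC_HIGHEST_TESTED_NM / WILD_TYPE_PAX7_NM

for name, fold in fold_reduction.items():
    print(f"{name}: {fold:.1f}-fold")
print(f"SC: more than {sc_fold_lower_bound:.0f}-fold")
```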

To explore this correlation further, we tested three additional proteins (SE, SF, and SG), each with alterations in two amino acid residues that comprise distinct subsets of the six residues altered in SC (see Table 1). All three of these mutant proteins displayed signaling activity in the explant culture assay, but only at significantly reduced levels. At 4 nM none of these three proteins repressed Pax-7 (Fig. 5I; data not shown); at 20 nM the SE and SG proteins repressed Pax-7 almost completely or partially, respectively, but SF did not (Fig. 5 J, L, and O). At 100 nM, the SE and SG proteins induced HNF-3β expression in most cells of the explant, but SF did so only in a small number of cells (Fig. 5 K, M, and P). Further assays at concentration intervals of 10 nM pinpointed the minimal concentrations required for Pax-7 repression, with values of ~20, ~30, and ~70 nM for SE, SG, and SF, respectively (results in Table 1). Competition binding assays also revealed a significantly reduced affinity of the SE and SG proteins for Ptc-CTD, and an even lower affinity for the SF protein (Fig. 6B; Table 1). These results indicate that normal Ptc binding and neural plate signaling activities require distinct contributions from multiple individual residues in the SC surface region. Furthermore, among proteins with alterations in distinct subsets of the SC mutant residues, Ptc-binding affinity correlated extremely well with neural plate signaling activity (Fig. 6C).
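The correlation described here can be examined roughly from the Table 1 values. The Python sketch below is not the analysis behind Fig. 6C: it simply computes a Pearson correlation on log-transformed values for wild type and the surface-residue proteins, treats the approximate Table 1 entries as point values, and omits SC because only lower bounds are available for it.

```python
import math

# protein: (Ptc-CTD K_I in nM, Pax-7 repression concentration in nM),
# both taken from Table 1 (repression values are approximate).
TABLE1 = {
    "WT": (0.48, 4),
    "SA": (0.66, 4),
    "SB": (0.48, 4),
    "SD": (0.84, 10),
    "SE": (2.4, 20),
    "SG": (4.3, 30),
    "SF": (9.1, 70),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Both quantities span more than an order of magnitude, so compare logs.
log_ki = [math.log10(ki) for ki, _ in TABLE1.values()]
log_conc = [math.log10(conc) for _, conc in TABLE1.values()]
r = pearson(log_ki, log_conc)

# Rank order of the three SC-subset proteins by affinity and by potency.
subset = ["SE", "SF", "SG"]
by_affinity = sorted(subset, key=lambda p: TABLE1[p][0])
by_potency = sorted(subset, key=lambda p: TABLE1[p][1])

print(f"Pearson r on log-log values: {r:.2f}")
print("order by K_I:", by_affinity, "| order by Pax-7 concentration:", by_potency)
```

With only seven approximate points this is a sanity check on the trend rather than a statistical test.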

FIG. 6. Binding of altered Shh-N proteins to Ptc. (A and B) Competition by altered proteins for binding of 32P-Shh-N to EcR-293 cells expressing Ptc-CTD. Binding of 32P-Shh-N in the presence of each altered protein at various concentrations is normalized to the total value of 32P-Shh-N bound (approximately 35% of input) in the absence of competitor. The SC mutant, inactive in signaling, also fails to compete for binding to Ptc-CTD. The SE, SF, and SG proteins, with intermediate levels of signaling activity, displayed intermediate levels of competition for binding to Ptc-CTD. Data are summarized in Table 1. (C) Signaling activity as a function of Ptc affinity. On the basis of neural plate signaling assays (Figs. 2, 5; Table 1), protein concentrations required for Pax-7 repression are plotted as a function of Ptc-binding affinity. The protein concentrations are plotted as ranges centered about the concentrations presented in Table 1. Note that there is an excellent correlation between Ptc binding and activity in Pax-7 repression. The zinc hydrolase mutants EA, HA, and EH (Table 1) also corroborate this correlation but are omitted for clarity.

We also purified Shh-N proteins with deletions of amino- or carboxyl-terminal residues and examined their activities qualitatively in signaling and in Ptc binding (Table 2). An altered protein lacking nine amino-terminal residues (∆N34) displayed signaling and Ptc-binding activities indistinguishable from wild type. In contrast, ∆N50, which lacks 25 amino-terminal residues, completely lost both activities. We note that the residues deleted in ∆N50 include P42 and K46, which were altered in the SC and SE mutant proteins, and that the mutant ∆N45 (lacking 20 amino-terminal residues), which does not contain P42, also lost signaling activity. The activity defects in these proteins are more severe than those of the SE protein, suggesting that loss of these amino-terminal residues may have some effect on the overall structure or stability of the Shh-N protein. A deletion mutant lacking residues from 166 to the carboxy terminus, ∆C165, had neither signaling nor Ptc-binding activities, and ∆C101 also lost signaling activity (Table 2). These deletions also remove residues that are altered in the SC protein (R154, S157, S178, and K179 in ∆C101; S178 and K179 in ∆C165), but again, the deleted regions are sufficiently extensive that they would be expected to affect protein structure.

Table 2. Properties of Shh-N deletion derivatives

Protein   Residues present        HNF-3β induction   Ptc-CTD binding   5E1 IP   Heparin binding
WT        wild type (aa 25–198)   +                  +                 +        +
∆N34      aa 34–198               +                  +                 +        +
∆N45      aa 45–198               –                  ND                ND       –
∆N50      aa 50–198               –                  –                 –        –
∆C165     aa 25–165               –                  –                 –        +
∆C101     aa 25–101               –                  ND                ND       +

The properties of these mutant proteins were either indistinguishable or completely different from wild type in a qualitative neural plate assay for induction of HNF-3β or in a qualitative Ptc-binding competition assay using QT6 cells transiently transfected with Ptc-CTD and Cy2-labeled Shh-N (see Material and Methods). Immunoprecipitation by 5E1 monoclonal antibody and binding to heparin agarose are indicated. ND, not determined.
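The bookkeeping behind the deletion analysis can be cross-checked with a short sketch. It is only an illustration: the ASCII labels (for example dN50 for ∆N50) and the helper names are assumptions introduced here, and the residue list covers only the positions named in the text, not the full set of residues altered in the SC protein.

```python
# Inclusive residue ranges retained by each Shh-N derivative (from Table 2).
# ASCII labels such as "dN34" stand in for the deletion names; they are illustrative.
RETAINED = {
    "WT": (25, 198),
    "dN34": (34, 198),
    "dN45": (45, 198),
    "dN50": (50, 198),
    "dC165": (25, 165),
    "dC101": (25, 101),
}

# Residues named in the text as altered in the SC and/or SE proteins.
NAMED_RESIDUES = {"P42": 42, "K46": 46, "R154": 154, "S157": 157, "S178": 178, "K179": 179}


def missing_residues(construct):
    """Return the named residues absent from a construct, ordered by position."""
    start, end = RETAINED[construct]
    ordered = sorted(NAMED_RESIDUES.items(), key=lambda item: item[1])
    return [name for name, pos in ordered if not (start <= pos <= end)]


def residues_deleted(construct):
    """Count residues removed relative to the wild-type range."""
    wt_start, wt_end = RETAINED["WT"]
    start, end = RETAINED[construct]
    return (start - wt_start) + (wt_end - end)


if __name__ == "__main__":
    for name in RETAINED:
        print(name, residues_deleted(name), missing_residues(name))
```

Under these assumptions the amino-terminal deletions remove 9, 20, and 25 residues, matching the counts given in the text, and each derivative lacks exactly the named residues listed for it there.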

We note that in all of the altered proteins tested we failed to find a single example of a protein that retained signaling activity while losing the ability to bind Ptc. As seen in Fig. 6C, there is an excellent correlation between Ptc binding and signaling activity in all altered proteins for which these properties can be measured, and these results strongly suggest that Ptc binding may be a critical requirement for signaling.

Antibody Recognition and Heparin Binding of Altered Proteins. The monoclonal antibody 5E1, directed against Shh-N, blocks signaling in neural plate explants (14) (data not shown) and also blocks binding of the Shh-N protein to Ptc-expressing cells (data not shown). The reactivity of the 5E1 antibody with altered Shh-N proteins was examined by immunoprecipitation. We found that all proteins that retain signaling and Ptc-binding activities, including wild type, SA, SB, SD, and ∆N34, also retain full reactivity with 5E1 (Fig. 7A; Tables 1, 2; data not shown). In contrast, the altered proteins SC, ∆N50, and ∆C165, which lost both signaling and Ptc-binding activities, were not immunoprecipitated by 5E1 (Fig. 7A; Tables 1, 2; data not shown). Altered proteins with intermediate signaling and Ptc-binding properties, such as SE, SF, and SG, displayed intermediate reactivities with 5E1 (Fig. 7B; Table 1). Reactivity of 5E1 with Shh-N proteins thus correlates well with Ptc binding and neural plate signaling activities.

Because 5E1 works well for immunoprecipitation and for immunocytochemistry but very poorly in Western analysis (data not shown), it appears to recognize an epitope present on the native Shh-N protein but not in denatured protein. The strong correlation between 5E1 binding, Ptc binding, and neural plate signaling furthermore suggests that the 5E1 epitope coincides with determinants required for these activities. One possible explanation for the coordinate loss of signaling, 5E1 binding, and Ptc binding in the SC protein is that the folded structure of this protein might be disrupted. Circular dichroism analysis, however, indicates that the secondary structure profile of SC is similar to that of wild-type Shh-N (data not shown), suggesting that any disruption in folded structure must be highly local in nature. In addition, mutations in distinct subsets of the residues altered in SC display intermediate phenotypes, suggesting multiple independent contributions of individual residues in formation of the Ptc-interacting region of the protein surface.

FIG. 7. Reactivity of 5E1 antibody with altered Shh-N proteins. After immunoprecipitation (IP) with the 5E1 monoclonal antibody, altered proteins were detected by Western blotting using a polyclonal antibody. The starting (input) and precipitated (IP) material are shown for each protein. (A) The wild-type, SA, SB, and SD altered proteins were precipitated well by the 5E1 antibody, but the SC protein was not. (B) SE, SF, and SG displayed intermediate reactivity with the 5E1 antibody.

The Shh-N protein also binds to heparin [(12); data not shown], and the crystal structure contains a sulfate anion at a location near the SC region (23). In addition, recent evidence suggests that tout velu, a Drosophila gene whose mammalian homologues function in the polymerization of glycosamines for synthesis of heparan sulfate proteoglycans (33, 34), plays a role in the reception and transport of the Hh signal (35). We therefore tested whether the alterations in our proteins affect their ability to bind to heparin agarose. As seen in Tables 1 and 2, only three of the proteins tested, SC, ∆N45, and ∆N50, lost the ability to bind heparin agarose, and these three proteins are completely inactive in Ptc binding and signaling. Some of the proteins that lose signaling and Ptc-binding activity retained the ability to bind heparin, indicating that heparin binding is not sufficient for Ptc binding and for signaling. Our data, however, would be consistent with the idea that heparin binding may be necessary for Ptc binding and for signaling.

DISCUSSION

The Putative Zinc Hydrolase in Shh-N. Alterations in residues that should be critical for the putative zinc hydrolase activity of Shh-N did not disrupt its ability to induce ventral neural cell types or to suppress dorsal markers, suggesting that catalytic activity is not required for Shh signaling in the neural plate. Although residues constituting the putative zinc hydrolase active site are widely conserved among Hh family members, they are not fully conserved in Drosophila (23), suggesting that hydrolase function is not required for signaling in this organism. We also introduced and ectopically expressed the E177A and EH mutant Shh constructs into Drosophila and compared their ability to mispattern the embryonic cuticle with that of wild-type Shh (30) and could detect no significant difference between them (H.E.F.Takahashi and P.A.B., unpublished data), further substantiating dispensability of catalytic activity for Shh-N signaling function in the context of developing Drosophila embryos. Furthermore, experiments with mutant proteins expressed in cultured cells suggest that the putative hydrolase activity is not required for the normal biogenesis and processing of Shh, nor for its normal state of modification (data not shown). We also note that we failed to detect any hydrolase activity of Shh-N in biochemical assays with a variety of substrates, including some like those for D,D-carboxypeptidase, which contained D-amino acid residues.

The putative zinc hydrolase of Shh-N has thus resisted our attempts to reveal an activity, either in biochemical or in in vitro or in vivo signaling assays, raising the possibility that the putative catalytic site represents an evolutionary vestige of its common ancestry with the D,D-carboxypeptidase family of proteins. In this view, the zinc atom may have lost its ancestral role in catalysis but could have retained a role in stabilizing protein structure through interactions with the side chains of coordinating residues. The lack of conservation of coordinating residues in the Drosophila protein may indicate a replacement of these interactions by other stabilizing interactions. General dispensability of hydrolase activity in Hh signaling is consistent with the importance of surface residues conserved among Hh proteins for binding to Ptc and for signaling (see below). Alternatively, it is possible that Shh-N hydrolase retains a role not detected by our biochemical or in vitro and in vivo signaling assays. Such a role likely would be modulatory in nature, given the essentially normal signaling activity of hydrolase mutant proteins, and its discovery may require targeted recombination to mutagenize the endogenous mouse Shh gene.

Direct Binding of Shh-N to Ptc and Activation of the Shh Pathway. Previous genetic and biochemical studies are consistent with the idea that Ptc may function as a Hh receptor. The biochemical analyses demonstrated that Shh-N protein binds to Ptc-expressing cells, that Ptc is coimmunoprecipitated with Shh-N and vice versa, and that Ptc can be crosslinked in a complex containing Shh-N (21, 22). Because the composition of the crosslinked complexes was not characterized, however, these studies could not exclude the possibility that instead of binding directly to Ptc, Shh-N may bind to another component of a complex that includes Ptc. These biochemical studies also did not examine the role of such an interaction in the activation of the Shh pathway. The latter is a particularly significant issue given the genetically demonstrated role of Ptc in sequestration of Hh protein to restrict its movement within Drosophila tissues (20).

In addressing these questions, we have identified a crosslinked product containing radiolabeled Shh-N that is specifically competed by unlabeled Shh-N but not by the unlabeled mutant SC protein. This crosslinked complex also contains Ptc because its formation depends on Ptc expression and because it displays an apparent molecular mass difference that corresponds closely to the difference between full-length Ptc and Ptc-CTD. Finally, for both Ptc molecules, the apparent mass of the complex is close to the sum of Shh-N plus Ptc. Thus, although direct binding ideally would be demonstrated by studies with purified components, the properties of our crosslinked complexes strongly suggest a direct association between Shh-N and Ptc with a probable stoichiometry of 1:1. Given the possible anomalies in migration of such crosslinked species, we cannot rule out the possibility that more than one Shh-N molecule is present in these complexes, nor can we distinguish between the participation of the slower- or faster-migrating Ptc forms, which probably differ in their glycosylation (22). Although it has been reported that Ptc interacts with Smo independently of Shh-N (21), the apparent masses of our crosslinked products would appear to exclude Smo, which has a predicted mass of 87 kDa (21). It is possible that the cells we utilized do not express Smo endogenously or that high-level expression of Smo is required for formation of a complex with Ptc/Shh. Alternatively, our experimental conditions for crosslinking might disrupt Ptc-Smo interaction or fail to capture Smo protein.

We also have identified, using altered Shh-N proteins, the region of the Shh-N protein surface that is involved in Ptc binding and have used these altered proteins to show that neural plate signaling activity is retained in proportion to the binding affinity for Ptc. Thus, the extensive alterations in surface residues of the SA, SB, and SD proteins do not affect or only mildly affect Ptc binding, and these proteins retain normal or nearly normal signaling activity. At the other extreme, the SC altered protein displays a complete loss of Ptc binding, and this is reflected in a complete loss of neural plate signaling activity. Even more telling, proteins carrying distinct subsets of the residues altered in SC result in intermediate levels of Ptc-binding activity and corresponding intermediate levels of neural plate signaling activity (Fig. 6C). Multiple individual residues within the SC surface region that contribute independently to Ptc binding thus also similarly contribute to signaling potency. Although our results cannot exclude a role for interactions with other proteins in reception of the Shh signal, they do strongly suggest that direct binding to Ptc is a critical step, and this information will serve as the basis for further elucidation of downstream events.

We thank Drs. M. Scott and T. Jessell for various cDNAs and antibodies. We also thank J. Wrable for help with circular dichroism analysis, Dr. J. Taipale for suggestions on the crosslinking experiment, Dr. M. Cooper and K. Young for preparation of some recombinant proteins, Dr. O. Sundin for help with chicken embryo experiments, Dr. R. Mann for comments on the manuscript, and members of the Beachy laboratory for discussions and suggestions. N.F. was a postdoctoral fellow of the Human Frontier Science Program. D.J.L. and P.A.B. are investigators of the Howard Hughes Medical Institute. This work was supported in part by grants from the American Paralysis Association and the Ara Parseghian Medical Research Foundation.

Under a licensing agreement between Ontogeny, Inc. and the Johns Hopkins University, Dr. Beachy and the University hold equity in Ontogeny and are entitled to a share of royalties from sales of products related to the research described in this article. Dr. Beachy serves on Ontogeny’s Scientific Advisory Board and as a consultant to the company. All financial aspects of these arrangements are managed by the University in accordance with its policies.

1. Perrimon, N. (1995) Cell 80, 517–520.
2. Hammerschmidt, M., Brook, A. & McMahon, A.P. (1997) Trends Genet. 13, 14–21.
3. Goodrich, L.V. & Scott, M.P. (1998) Neuron 21, 1243–1257.
4. Chiang, C., Litingtung, Y., Lee, E., Young, K., Corden, J.L., Westphal, H. & Beachy, P. (1996) Nature (London) 383, 407–413.
5. Beachy, P.A., Cooper, M.K., Young, K.E., von Kessler, D.P., Park, W.J., Hall, T.M., Leahy, D.J. & Porter, J.A. (1997) Cold Spring Harbor Symp. Quant. Biol. 62, 191–204.
6. Lee, J.J., Ekker, S.C., von Kessler, D.P., Porter, J.A., Sun, B.I. & Beachy, P.A. (1994) Science 266, 1528–1537.
7. Porter, J.A., von Kessler, D.P., Ekker, S.C., Young, K.E., Lee, J.J., Moses, K. & Beachy, P.A. (1995) Nature (London) 374, 363–366.
8. Porter, J.A., Young, K.E. & Beachy, P.A. (1996) Science 274, 255–259.
9. Pepinsky, R.B., Zeng, C., Wen, D., Rayhorn, P., Baker, D.P., Williams, K.P., Bixler, S.A., Ambrose, C.M., Garber, E.A., Miatkowski, K., et al. (1998) J. Biol. Chem. 273, 14037–14045.
10. Porter, J.A., Ekker, S.C., Park, W.-J., von Kessler, D.P., Young, K.E., Chen, C.-H., Ma, Y., Woods, A.S., Cotter, R.J., Koonin, E.V., et al. (1996) Cell 86, 21–34.
11. Tanabe, Y. & Jessell, T.M. (1996) Science 274, 1115–1123.
12. Roelink, H., Porter, J.A., Chiang, C., Tanabe, Y., Chang, D.T., Beachy, P.A. & Jessell, T.M. (1995) Cell 81, 445–455.
13. Tanabe, Y., Roelink, H. & Jessell, T. (1995) Curr. Biol. 5, 651–658.
14. Ericson, J., Morton, S., Kawakami, A., Roelink, H. & Jessell, T.M. (1996) Cell 87, 661–673.
15. Ericson, J., Rashbass, P., Schedl, A., Brenner-Morton, S., Kawakami, A., van Heyningen, V., Jessell, T.M. & Briscoe, J. (1997) Cell 90, 169–180.
16. Ingham, P.W., Taylor, A.M. & Nakano, Y. (1991) Nature (London) 353, 184–187.
17. Ingham, P.W. & Hidalgo, A. (1993) Development (Cambridge, U.K.) 117, 283–291.
18. Tabata, T. & Kornberg, T.B. (1994) Cell 76, 89–102.
19. Ingham, P.W. (1993) Nature (London) 366, 560–562.
20. Chen, Y. & Struhl, G. (1996) Cell 87, 553–563.
21. Stone, D.M., Hynes, M., Armanini, M., Swanson, T.A., Gu, Q., Johnson, R.L., Scott, M.P., Pennica, D., Goddard, A., Phillips, H., Noll, M., et al. (1996) Nature (London) 384, 129–134.
22. Marigo, V., Davey, R.A., Zuo, Y., Cunningham, J.M. & Tabin, C.J. (1996) Nature (London) 384, 176–179.
23. Hall, T.M.T., Porter, J.A., Beachy, P.A. & Leahy, D.J. (1995) Nature (London) 378, 212–216.
24. Dideberg, O., Charlier, P., Dive, G., Joris, B., Frere, J.M. & Ghuysen, J.M. (1982) Nature (London) 299, 469–470.
25. Bussiere, D.E., Pratt, S.D., Katz, L., Severin, J.M., Holzman, T. & Park, C.H. (1998) Mol. Cell 2, 75–84.
26. Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. & Struhl, K. (1994) Current Protocols in Molecular Biology (Wiley, New York).
27. Cooper, M.K., Porter, J.A., Young, K.E. & Beachy, P.A. (1998) Science 280, 1603–1607.
28. Christianson, D.W. (1991) Adv. Protein Chem. 42, 281–355.
29. Lessard, I.A. & Walsh, C.T. (1999) Chem. Biol. 6, 177–187.
30. Chang, D.T., Lopez, A., von Kessler, D.P., Chiang, C., Simandl, B.K., Zhao, R., Seldin, M.F., Fallon, J.F. & Beachy, P.A. (1994) Development (Cambridge, U.K.) 120, 3339–3353.
31. Krauss, S., Concordet, J.-P. & Ingham, P.W. (1993) Cell 75, 1431–1444.
32. Ekker, S.C., McGrew, L.L., Lai, C.-J., Lee, J.J., von Kessler, D.P., Moon, R.T. & Beachy, P.A. (1995) Development (Cambridge, U.K.) 121, 2337–2347.
33. McCormick, C., Leduc, Y., Martindale, D., Mattison, K., Esford, L.E., Dyer, A.P. & Tufaro, F. (1998) Nat. Genet. 19, 158–161.
34. Lind, T., Tufaro, F., McCormick, C., Lindahl, U. & Lidholt, K. (1998) J. Biol. Chem. 273, 26265–26268.
35. Bellaiche, Y., The, I. & Perrimon, N. (1998) Nature (London) 394, 85–88.
36. Kraulis, P.J. (1991) J. Appl. Crystallogr. 24, 946–950.
37. Nicholls, A., Sharp, K.A. & Honig, B. (1991) Proteins 11, 281–296.

Structure-assisted design of mechanism-based irreversible inhibitors of human rhinovirus 3C protease with potent antiviral activity against multiple rhinovirus serotypes

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

D. A. MATTHEWS*, P. S. DRAGOVICH, S. E. WEBBER, S. A. FUHRMAN, A. K. PATICK, L. S. ZALMAN, T. F. HENDRICKSON, R. A. LOVE, T. J. PRINS, J. T. MARAKOVITS, R. ZHOU, J. TIKHE, C. E. FORD, J. W. MEADOR, R. A. FERRE, E. L. BROWN, S. L. BINFORD, M. A. BROTHERS, D. M. DELISLE, AND S. T. WORLAND

Agouron Pharmaceuticals, Inc., 3565 General Atomics Court, San Diego, CA 92121

ABSTRACT Human rhinoviruses, the most important etiologic agents of the common cold, are messenger-active single-stranded monocistronic RNA viruses that have evolved a highly complex cascade of proteolytic processing events to control viral gene expression and replication. Most maturation cleavages within the precursor polyprotein are mediated by rhinovirus 3C protease (or its immediate precursor, 3CD), a cysteine protease with a trypsin-like polypeptide fold. High-resolution crystal structures of the enzyme from three viral serotypes have been used for the design and elaboration of 3C protease inhibitors representing different structural and chemical classes. Inhibitors having α,β-unsaturated carbonyl groups combined with peptidyl-binding elements specific for 3C protease undergo a Michael reaction mediated by nucleophilic addition of the enzyme’s catalytic Cys-147, resulting in covalent-bond formation and irreversible inactivation of the viral protease. Direct inhibition of 3C proteolytic activity in virally infected cells treated with these compounds can be inferred from dose-dependent accumulations of viral precursor polyproteins as determined by SDS/PAGE analysis of radiolabeled proteins. Cocrystal-structure-assisted optimization of 3C-protease-directed Michael acceptors has yielded molecules having extremely rapid in vitro inactivation of the viral protease, potent antiviral activity against multiple rhinovirus serotypes and low cellular toxicity. Recently, one compound in this series, AG7088, has entered clinical trials.

Picornaviruses are small nonenveloped RNA viruses with a single strand of messenger-active genomic RNA 7,500–8,000 nucleotides in length, which is replicated in the cytoplasm of infected cells. The family currently is divided into six genera with similar genetic organization and translational strategies. Among its members are several important human and veterinary pathogens, including poliovirus and coxsackievirus (Enterovirus), foot-and-mouth disease virus (Aphthovirus), encephalomyocarditis virus (Cardiovirus), hepatitis A virus (Hepatovirus), and human rhinoviruses (Rhinovirus). As a consequence of limitations imposed by a small monocistronic RNA viral genome, picornaviruses depend on a strategy for temporal gene expression that includes highly controlled cotranslational and posttranslational processing of a precursor polyprotein by virally encoded proteases to generate the individual structural and nonstructural proteins needed for viral replication. While still in the process of synthesis, the polyprotein is cleaved proteolytically by the virally encoded 2A protease to release P1, the precursor to capsid proteins, from P2–P3. Subsequent processing of P1 to 1AB, 1C, and 1D and all P2 and P3 processing to release proteins needed for RNA replication depend on viral 3C protease activity (1–3).

In addition to its role in polyprotein processing, picornavirus 3C sequences are involved in proteolytic degradation of specific cellular proteins associated with host-cell transcription and in direct binding to viral RNA as part of a replication complex required for synthesis of plus-strand viral RNA (4–7).

Rhinoviruses are primary causative agents of the common cold. Whereas these infections are usually mild and self-limiting, consequences can be more severe for the elderly, for immune-compromised individuals, and for those predisposed to respiratory illness such as asthma (8). In the case of picornaviruses with limited serotypic diversity, such as poliovirus, foot-and-mouth disease virus, and hepatitis A virus, highly protective vaccines have been developed that are in use worldwide. On the other hand, developing effective immunizations against rhinovirus infections or against the pathogenic nonpolio enteroviruses is anticipated to be more challenging, owing to the large number of existing serotypes: at least 100 rhinoviruses and 65 enteroviruses. In an attempt to address this need, we have undertaken a program directed at discovering rhinovirus 3C protease inhibitors with antiviral activity against the spectrum of known rhinovirus serotypes. The results of these efforts and the identification of an antirhinoviral compound now entering clinical trials are described below.

Picornaviral 3C Proteases

Picornaviral 3C proteases are small monomeric proteins with molecular masses around 20 kDa. Crystal structures exist for 3C proteases from type 14 human rhinovirus (9), hepatitis A (10), and poliovirus (11). Viral 3C proteases fold into two topologically equivalent six-stranded β-barrels with an extended shallow groove for substrate binding located between the two domains. In rhinovirus 3C protease, the catalytically important residues Cys-147, His-40, and Glu-71 form a linked cluster of amino acids with an overall geometry similar to the Ser-His-Asp catalytic triad found in the trypsin-like family of serine proteases. The highly conserved sequence Gly-X-Cys-Gly-Gly in viral 3C proteases serves to position Cys-147 for nucleophilic attack on the substrate’s carbonyl carbon and to orient backbone NH groups of Gly-145 and Cys-147 to form an “oxyanion hole” for stabilization of a tetrahedral transition state (9). Thus, the catalytic machinery for activation of the attacking nucleophile and stabilization of a tetrahedral intermediate-transition state in 3C proteases closely resembles that of trypsin-like serine proteases, suggesting that the viral 3C proteases are related mechanistically to serine proteases rather than to the papain-like cysteine proteases. Picornaviral 3C proteases process a limited number of cleavage sites in the virally encoded polyprotein. Most cleavages occur between Gln-Gly peptide bonds with distinct differences in the efficiency of cleavage at various junction sites. Recombinant rhinovirus 3C protease has an absolute requirement for Gln-Gly cleavage junctions in peptide substrates ranging from 7 to 11 aa in length (12).

*To whom reprint requests should be addressed. E-mail: [email protected]
PNAS is available online at www.pnas.org.
Abbreviation: CBZ, benzyloxycarbonyl.
Data deposition: The atomic coordinates have been deposited in the Protein Data Bank, www.rcsb.org (PDB code 1CQQ).
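The Gln-Gly junction preference described above can be illustrated with a minimal sketch that scans a one-letter amino acid string for Q followed by G. The function names and the example peptide are hypothetical, introduced only for illustration. The sketch also ignores the sequence-context effects on cleavage efficiency noted in the text and any sites that are not Gln-Gly, so it should not be read as a predictor of actual 3C processing.

```python
def gln_gly_junctions(sequence):
    """Return 1-based positions of each Gln (Q) immediately followed by Gly (G)."""
    seq = sequence.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i] == "Q" and seq[i + 1] == "G"]


def cleave(sequence):
    """Split a sequence after every Gln of a Gln-Gly pair (hypothetical helper)."""
    fragments = []
    start = 0
    for pos in gln_gly_junctions(sequence):
        fragments.append(sequence[start:pos])  # fragment ends with the P1 Gln
        start = pos                            # next fragment starts with the P1' Gly
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    peptide = "TLFQGPVQAQGA"  # made-up 12-residue peptide, not a viral sequence
    print(gln_gly_junctions(peptide))
    print(cleave(peptide))
```

In the made-up peptide, the Gln at position 8 is followed by Ala and is therefore not reported, which mirrors the requirement for Gly at the position after the scissile bond.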

Inhibitors of 3C Protease and the Issue of Serotypic Diversity Among Rhinoviruses

Picornaviral 3C proteases represent a unique class of enzymes that integrate characteristics of both serine and cysteine proteases with an unusual specificity for Gln-Gly cleavage junctions. The absence of known cellular homologues contributes to interest in 3C protease as a potentially important target for antiviral drug design. However, the vast serotypic diversity among rhinoviruses raises the question of whether or not a single agent can effectively target 3C proteases from the 100 or so rhinovirus serotypes capable of infecting humans. Primary sequence data are available for 3C proteases from 10 different rhinovirus serotypes, including the type 2 and type 14 enzymes that have less than 50% amino acid identity.

To address these diversity concerns before initiating a concerted drug-discovery effort, we undertook a program to obtain structural information on peptide-based inhibitors bound to 3C proteases from multiple rhinovirus serotypes. We wanted to identify the geometric and electronic factors that modulate protein/substrate (inhibitor) recognition, the extent to which specific residues that form the substrate (inhibitor) binding site of 3C protease are conserved across rhinovirus serotypes, and whether or not these binding-site residues are arranged similarly in 3C proteases from different virus serotypes.

Peptide Aldehydes Bound to Serotype 2 Rhinovirus 3C Protease. Peptide aldehydes have been used extensively as inhibitors of serine and cysteine proteases, although they typically have not proven effective as drug candidates because of their poor pharmacological properties. They bind as reversible adducts in which the nucleophilic cysteine or serine makes a covalent bond with the carbonyl carbon of the aldehyde, forming a stable tetrahedral species. Short peptidic aldehydes having sequences similar to canonical 3C protease cleavage sites have been reported as inhibitors of both rhinovirus and hepatitis A viral proteases (13–15). The combination of glutamine at P1 with aldehyde functionality causes cyclization on the aldehyde (16). To circumvent this problem, replacements for the γ-carboxamide were sought that prevent internal cyclization but retain high affinity for the 3C protease S1 specificity pocket (15). Compound I (Fig. 1) is an N-terminal protected tripeptide aldehyde in which the -CH2C(O)NH2 of Gln is replaced with an N-acetyl isostere. Compound I is a 6-nM inhibitor of type 14 human rhinovirus 3C protease. Whereas the original x-ray structural studies of rhinovirus 3C protease were performed by using the serotype 14 enzyme (9), subsequent analysis of inhibitor binding was carried out mainly with type 2 3C protease, because of both the relative ease of obtaining cocrystals and their generally superior diffraction properties. Fig. 2 shows the 2.2-Å x-ray structure of compound I complexed with serotype 2 rhinovirus 3C protease (15).

The peptide aldehyde I binds to rhinovirus 3C protease in a partially extended conformation with inhibitor backbone atoms aligned for antiparallel β-sheet-type hydrogen bonding with an exposed β-strand (βE2) of the protein comprising residues 162–165. The inhibitor’s P1 side chain lies in a shallow pocket bounded by βE2, by residues 142–144, and by His-161, the last of which donates a hydrogen bond to the N-acetyl oxygen. This oxygen accepts a second hydrogen bond from the side-chain hydroxyl of Thr-142. The inhibitor’s acetyl methyl group is close to the backbone carbonyl of Thr-142 (3.3 Å), suggesting that substrates or inhibitors having a similarly positioned P1 glutamine-like side chain could form a third hydrogen bond to enhance specific recognition of a γ-carboxamide group.

FIG. 1. Rhinovirus 3C protease inhibitors. Ki, inhibition constant; kobs, observed rate of inactivation; I, inhibitor concentration.

The P1 backbone amide makes a weak (3.2-Å) hydrogen bond with the carbonyl oxygen of Val-162. The deep S2 pocket easily accommodates the inhibitor’s bulky P2 Phe side chain, which is bounded on one side by the side chain of His-40 and on the other side by residues 127–130. Two ordered water molecules reside at the back of the S2 pocket. The side-chain hydroxyl and backbone NH of Ser-128 form hydrogen bonds with the inhibitor’s P2 NH and the carbonyl oxygen of the terminal benzyloxycarbonyl (CBZ) group, respectively. Two main-chain hydrogen bonds tether the inhibitor’s P3 Leu to backbone atoms of Gly-164, whereas the isobutyl side chain is mostly solvent-exposed. The benzyl portion of the CBZ group packs into a shallow hydrophobic pocket that probably accommodates a substrate’s P4 side chain. The side chain of Asn-165 is positioned directly above the benzene of CBZ, with its carboxamide NH pointing into the face of the aromatic ring, suggesting that some additional binding energy probably derives from this favorable amino-aromatic interaction (17).

The affinity of peptide aldehyde inhibitors for trypsin-like serine proteases has been attributed to their ability to form, with the active-site serine, hemiacetals that resemble the transition state in amide hydrolysis, with the oxyanion stabilized in a structurally conserved oxyanion hole. Considering the structural homology between 3C protease and trypsin-like serine proteases, we anticipated that the tetrahedral hemithioacetal oxygen of compound I when bound to 3C would be positioned similarly within the oxyanion hole. Indeed, we showed previously that 2,3-dioxindole inhibitors (see Fig. 1, compound II) form stable tetrahedral adducts with 3C protease in which the O3 oxygen is stabilized in just this manner (18). However, compound I binds in a non-transition-state conformation with the oxygen of the hemithioacetal stabilized by hydrogen bonding to Nε2 of His-40.

FIG. 2. Compound I bound to serotype 2 human rhinovirus 3C protease. The protein is rendered as a semitransparent solvent-accessible surface with associated protein backbone and side-chain atoms colored pink. Catalytic triad residues are blue. Red spheres represent ordered solvent molecules. Inhibitor atoms are colored green for carbon, blue for nitrogen, and red for oxygen. The inhibitor carbon covalently bonded to Cys-147 is highlighted in light green.

Compared with 3C protease complexes with 2,3-dioxindole inhibitors, the complex with compound I also differs in the main-chain conformation for protein residues 144–145. The peptide linkage joining Ser-144 and Gly-145 flips around so that NH (residue 145), instead of pointing into the oxyanion hole, is directed out toward solvent where it hydrogen bonds with an ordered water molecule. This structure suggests that, for 3C, optimum alignment of NH dipoles to form a classically configured oxyanion hole analogous to that seen in trypsin-like serine proteases may not occur in the native protein but rather requires a conformational change induced by substrate (or inhibitor) binding.

The Extended Substrate (Inhibitor) Binding Site for Rhinovirus 3C Protease Is Highly Conserved Among Different Viral Serotypes. High-resolution x-ray crystal structures for serotype 2 and serotype 16 3C proteases (overall amino acid sequence identity of 80%) bound to various peptide-based aldehyde inhibitors reveal that the two respective active sites are nearly identical (D.A.M., unpublished results). Not only do protein backbone atoms superpose within experimental error (<0.3 Å), but amino acid side chains interacting with peptide aldehyde inhibitors are identically conserved and oriented similarly in the complexes, except at position 130. Even for the more distantly related rhinovirus serotypes, there is a high level of amino acid identity for 3C protease residues that modulate binding of peptide aldehyde inhibitors such as compound I. There are 21 residues in serotype 2 3C protease that interact directly with the bound inhibitor. Of these, 17 are identically conserved in the 10 3C proteases of known sequence from different rhinovirus serotypes. For three of the nonconserved residues (residues 126, 144, and 146), interactions with the bound inhibitor are modulated by peptide backbone atoms only, suggesting that side-chain variation at these positions may not affect inhibitor binding significantly. Only in the case of residue 130 is there a nonconserved amino acid with a side chain directly contacting compound I. Residue 130 is either Asn or Thr in the 10 known rhinovirus 3C protease sequences. In the type 2 enzyme, Asn-130 is positioned at the back of the S2 specificity pocket where its side chain is in van der Waals contact with the inhibitor’s P2 benzyl group (Fig. 2). Nearby, but not directly contacting the P2 Phe, is a second nonconserved residue at position 69 (Lys or Asn, depending on serotype) that hydrogen bonds to ordered water molecules at the back of the S2 pocket. In summary, the available crystallographic and amino acid sequence data suggest that inhibitors of rhinovirus 3C protease could be expected to show efficacy against the enzyme from multiple viral serotypes provided they do not depend on binding determinants at the back of the S2 specificity pocket where structural variability between serotypes may be most pronounced.
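A tally such as "17 of 21 contact residues identically conserved across 10 sequences" can be reproduced from a multiple sequence alignment by counting, column by column, how many sequences differ from a chosen reference. The following is only a minimal sketch of that bookkeeping. The function name `count_variants` and the four-residue toy alignment are hypothetical placeholders invented for illustration; they are not rhinovirus 3C protease sequences, and the choice of the first row as reference is an assumption.

```python
def count_variants(alignment, reference_index=0):
    """Return, for each alignment column, how many sequences differ
    from the reference sequence at that column."""
    reference = alignment[reference_index]
    if any(len(seq) != len(reference) for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        sum(1 for seq in alignment if seq[col] != reference[col])
        for col in range(len(reference))
    ]


# Hypothetical toy alignment (NOT real 3C protease data): 4 sequences, 4 columns.
toy_alignment = [
    "GTNS",  # reference row
    "GTNS",
    "GTTS",  # differs at column 2
    "GTNA",  # differs at column 3
]

differences = count_variants(toy_alignment)
# Columns with zero differences are "identically conserved".
conserved_columns = [i for i, d in enumerate(differences) if d == 0]

print("differences per column:", differences)
print("identically conserved columns:", conserved_columns)
```

Restricting the columns to the residues that contact the inhibitor would give the kind of count quoted in the text; which residues count as contacts is a structural judgment that this sketch does not attempt.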

Strategies for Rhinovirus 3C Protease Inhibitor Design. Several considerations come into play when developing strategies for design of therapeutically efficacious serine and cysteine protease inhibitors. For many of these proteins, specificity pockets for substrate (or inhibitor) recognition are shallow, and binding determinants are widely dispersed over large surface areas. Difficulties inherent in discovering small molecules with high affinity for such binding sites are in many respects analogous to those encountered in attempting to disrupt protein-protein interactions with small effector molecules. Serine proteases such as factor Xa and thrombin, proteins involved in the blood-coagulation pathway with deep, well-defined S1 specificity pockets, have been targeted effectively with structurally diverse, small, noncovalent inhibitors and thus are exceptions to this generalization (19). However, for virally encoded serine and cysteine proteases of known structure, such as the herpes family of serine proteases, hepatitis C NS3 protease, and picornavirus 3C proteases, the fact that substrate recognition is modulated by extensive protein-protein interactions represents a significant impediment for design of specific inhibitors. We know that inhibitor potency can be enhanced by taking advantage of the possibility for covalent adduct formation afforded by the presence of a reactive serine or cysteine at the active sites of these proteases. In the case of 3C, these effects are dramatic. Whereas compound I has a Ki of 6 nM against the serotype 14 enzyme, reduction of the aldehyde functionality to the corresponding alcohol yields a molecule with no measurable inhibition at concentrations as high as 100 µM (15). An optimized 9-aa substrate for 3C has a Km of only 400 µM, showing weak binding to this protease even for relatively large peptide substrates (12).
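To put the aldehyde-versus-alcohol comparison on an energy scale, the ratio of the two inhibition constants can be converted to a free-energy difference with ΔΔG = RT ln(K_weak/K_tight). The sketch below is an illustration only. It treats 100 µM as a conservative lower bound on the alcohol's Ki (the text reports only that no inhibition was measurable at that concentration), and it assumes a temperature of 298.15 K, which is not stated in the text. The function name `binding_energy_gap` is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_energy_gap(k_weak, k_tight, temperature_k=298.15):
    """Free-energy difference (kcal/mol) implied by two inhibition constants.

    temperature_k defaults to 298.15 K, an assumption for illustration.
    """
    return R_KCAL * temperature_k * math.log(k_weak / k_tight)


ki_aldehyde = 6e-9               # compound I, 6 nM (from the text)
ki_alcohol_lower_bound = 100e-6  # no inhibition at 100 uM, used as a lower bound

fold = ki_alcohol_lower_bound / ki_aldehyde
gap = binding_energy_gap(ki_alcohol_lower_bound, ki_aldehyde)

print(f"potency ratio: at least {fold:,.0f}-fold")
print(f"free-energy gap: at least {gap:.1f} kcal/mol")
```

Because the alcohol's true Ki is unknown, both numbers are lower bounds; the actual contribution of covalent adduct formation may be larger.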

Not surprisingly, in light of these results, we have had little success identifying small noncovalent inhibitors of 3C protease. The alternative approach of incorporating specific noncovalent recognition plus an electrophile that can react covalently with the active site nucleophile is conceptually attractive. However, potency and the inherent chemical reactivity of the electrophilic center are usually correlated. Highly reactive electrophiles are likely to target nonselectively other cellular proteins and nonenzymatic biological nucleophiles, such as glutathione, rendering such agents unacceptable as drug candidates. In earlier work, we reported on the design of potent reversible 3C protease inhibitors based on a 2,3-dioxindole (isatin) core (18). When elaborated with substituents providing recognition in the S1 and S2 specificity pockets of 3C protease, inhibitors with low nanomolar Ki were obtained. An x-ray cocrystal structure of compound II revealed covalent attachment of Cys-147 to the electrophilic center (C2) with the carboxamide and benzothiophene groups positioned as expected in the S1 and S2 pockets (18). Unfortunately, all isatin inhibitors tested were devoid of antiviral activity and/or were toxic, properties most probably attributable to their high electrophilic reactivity. These findings led us to consider other types of covalent inhibitors where the chemical reactivity of the electrophilic center can be more effectively modulated in the context of molecules having high specificity for 3C protease.

Irreversible Michael Acceptors as Inhibitors of 3C Protease

Peptidic substrates in which the scissile amide carbonyl is replaced by a Michael acceptor were first introduced as specific irreversible inhibitors of the cysteine protease papain by Hanzlik and coworkers (20, 21). We reasoned that, although this reaction is probably facilitated by the especially nucleophilic thiolate-imidazolium ion pair in papain-like cysteine proteases, suitably activated Michael acceptors might also undergo addition by the presumably less nucleophilic catalytic cysteine of 3C. A trans-α,β-unsaturated ethyl ester incorporated into a CBZ-protected tripeptide corresponding to the N-terminal portion of a canonical 3C protease cleavage sequence (Fig. 1, compound III) afforded a compound with relatively potent irreversible inhibition of 3C (22). The compound had moderate antiviral activity in HeLa cells infected with rhinovirus serotype 14, was nontoxic to the limit of its solubility, and was not inactivated by short exposure to DTT. These results encouraged us to initiate additional studies of Michael acceptors to enhance their activity against 3C protease further.

Fig. 3 shows the 2.3-Å x-ray structure of compound III bound to serotype 2 3C protease. The peptidic portion of the molecule closely resembles that of the aldehyde I and binds similarly to the enzyme active site (24). Unlike compound I, the P1 side chain of compound III is identical to that for Gln, the P1 residue in the vast majority of 3C cleavage sequences. The carboxamide oxygen accepts hydrogen bonds from the side chains of His-161 and Thr-142, and the amide nitrogen donates hydrogen bonds to the backbone carbonyl oxygen of Thr-142 and to an ordered water molecule. Thus, all possible hydrogen bonding interactions for a Gln side chain are fully satisfied within the complementary S1 binding site. The geometrical specificity conferred by these highly directional hydrogen bonds is important in orienting the inhibitor’s vinyl group (or in the case of a substrate, the susceptible carbonyl carbon) for nucleophilic attack by Cys-147. Cys-147 is covalently linked to the inhibitor’s electrophilic β-carbon with the carbonyl oxygen of the ethyl ester positioned above the oxyanion hole, where it makes a hydrogen bond to the backbone amide of Cys-147. As observed for aldehyde inhibitors bound to 3C protease, the 144–145 peptide linkage has the backbone amide pointing away from the oxyanion hole, although low occupancy (~20%) of the other conformer having NH (residue 145) directed toward the oxyanion hole is seen in this and several other P1 Gln-containing Michael acceptors for which we have obtained high-resolution x-ray cocrystal structures. The ethyl ester portion of the Michael acceptor extends into the leaving group side of the protease active site formed by residues 22–25 and by the tight loop connecting β-strands βA2 and βB2. The leaving group pocket is of sufficient size to accommodate the ethyl ester group easily in an extended low-energy Z conformation.

As noted previously, the stretch of amino acids 142–146 immediately N-terminal to the catalytic cysteine is important in 3C proteases for both substrate recognition and stabilization of the tetrahedral intermediate-transition state. In the absence of bound ligands, the corresponding residues in rhinovirus, poliovirus, and hepatitis A 3C proteases exist in multiple conformations and/or are highly mobile, as evidenced by average temperature factors of 50–60 Å². In rhinovirus 3C protease cocrystal structures with inhibitors having Gln-like side chains, the segment 142–146 adopts a well-defined conformation (except for the 144–145 peptide linkage, which has either of two conformations) with temperature factors below the average for the remainder of the protein. Thus, Gln side-chain recognition in the S1 pocket is tightly coupled with a disorder-to-order transition in a crucial region of the protein involved in transition-state stabilization. The available crystallographic evidence suggests that peptides lacking Gln-like functionality at P1 are unable to select the catalytically relevant conformation for the protein segment 142–146 from an ensemble of accessible states, providing a structural explanation for the observation that proteolysis of short 7- to 11-aa peptides by 3C protease has an absolute requirement for Gln at the P1 position (12). This observation also underscores the probable importance of P1 Gln functionality in mechanism-based activation of Michael acceptors as inhibitors of 3C protease.

Covalent irreversible inactivation of 3C by Michael acceptors proceeds according to a kinetic mechanism that can be broken down into two parts (Scheme 1).

FIG. 3. Compound III bound to serotype 2 human rhinovirus 3C protease. Color coding is the same as in Fig. 2.

The inhibitor initially forms a reversible encounter complex with 3C, which can then undergo a chemical step (nucleophilic attack by Cys-147) leading to stable covalent-bond formation. The observed second-order rate constant for inactivation (kobs/I) depends on both the equilibrium binding constant k2/k1 and the chemical rate for covalent bond formation k3 (23). We anticipated that Michael-acceptor inhibitors with specificity for 3C protease would likely achieve high rates of enzyme inactivation by combining good equilibrium binding with a modest rate of covalent-bond formation. The rate of chemical inactivation presumably depends not only on the intrinsic electrophilic character of the inhibitor but also on how the reactive vinyl group is oriented in the active site relative to Cys-147 before nucleophilic attack and on the extent to which the transition state for the reaction can be stabilized by the enzyme. Mechanism-based activation of an inherently weak Michael acceptor as a means of increasing the rate of the chemical step, and thus kobs/I, is conceptually more attractive than attempting to achieve a similar effect by simply increasing intrinsic electrophilic reactivity, which would likely impart undesirable properties to such compounds.
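The scheme graphic itself does not survive in this text. As a sketch only, the standard two-step inactivation mechanism consistent with the rate constants named above can be written as follows, assuming that k1 and k2 are the association and dissociation rate constants of the encounter complex E·I, that binding equilibrates rapidly relative to the chemical step, and that inhibitor is in excess over enzyme. The symbol K_i for the ratio k2/k1 is introduced here for convenience (the block requires the amsmath package).

```latex
\begin{align}
  &\mathrm{E} + \mathrm{I}
    \;\underset{k_2}{\overset{k_1}{\rightleftharpoons}}\;
    \mathrm{E{\cdot}I}
    \;\xrightarrow{\;k_3\;}\;
    \mathrm{E{-}I} \\
  K_i &= \frac{k_2}{k_1} \\
  k_{\mathrm{obs}} &= \frac{k_3\,[\mathrm{I}]}{K_i + [\mathrm{I}]} \\
  \frac{k_{\mathrm{obs}}}{[\mathrm{I}]} &\approx \frac{k_3}{K_i}
    = \frac{k_1 k_3}{k_2}
    \qquad \text{when } [\mathrm{I}] \ll K_i
\end{align}
```

In this limit the second-order constant rises either when equilibrium binding tightens (smaller k2/k1) or when the chemical step accelerates (larger k3), which are the two levers discussed in the text.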

Within this conceptual framework, we experimented first with the effect of varying the Michael-acceptor electron-withdrawing group and then, for a subset of electrophiles with suitable antiviral and toxicity profiles, proceeded to a second level of optimization involving the 3C protease recognition portion of compound III.

Michael-Acceptor Inhibitors of 3C Protease: Structure-Activity Studies

Variation of the Michael Acceptor. Recently, an extensive structure-activity study exploring modification of the Michael-acceptor portion of compound III has been published (24). The results can be summarized as follows. (i) A series of ester-derived Michael acceptors with substituted alcohol groups all showed good inhibitory activity with kobs/I values of 3,000 to 40,000 M⁻¹·s⁻¹. The benzyl ester had higher anti-3C protease activity than the parent compound (kobs/I = 39,400 compared with 25,000 M⁻¹·s⁻¹ for compound III) but performed worse in the antiviral assay (EC50 = 3.2 vs. 0.54 µM for compound III). cis-α,β-Unsaturated esters or trans-α,β-unsaturated esters substituted at the α-position had reduced activity compared with the benchmark compound III. (ii) Amide-containing Michael acceptors in general had reduced activity against 3C protease, poorer antiviral activity, and/or increased toxicity compared with the corresponding esters. (iii) Aliphatic and aryl α,β-unsaturated ketones were extremely potent anti-3C protease agents with kobs/I values between 120,000 and 500,000 M⁻¹·s⁻¹. However, these molecules had reduced antiviral activity (EC50 > 2 µM) and were toxic to cells. The ketones were also inactivated by short exposure to DTT, consistent with their expected high electrophilicity. (iv) Vinyl sulfones, nitriles, phosphonates, oximes, and several vinyl heterocycles had weak (kobs/I < 600 M⁻¹·s⁻¹) or no detectable inhibitory activity. (v) Michael acceptors with acyl lactam, acyl oxazolidinone, and acyl urea functionalities were potent 3C protease inhibitors but, like the corresponding ketones, were inactivated by exposure to nonenzymatic thiols.

As a consequence of their good inhibitory activity against 3C protease, their encouraging antiviral activity, stability in the presence of nonenzymatic thiols, low cellular toxicity, and ease of synthesis, trans-α,β-unsaturated esters emerged as the Michael acceptors of choice with which to initiate the process of optimizing the peptidic portion of compound III.

Variation of 3C Protease Recognition Elements. Analogs of compound III truncated after the P1 Gln or after the P2 Phe were poor 3C protease inhibitors with kobs/I values of 4.5 and 400 M⁻¹·s⁻¹, respectively. Therefore, structure-activity studies were conducted with tripeptide-derived molecules (24).

Substitutions at P1. Michael acceptors incorporating any variation in the γ-carboxamide portion of the P1 side chain had weak or no 3C protease inhibitory activity. Inclusion of various heteroatoms in the aliphatic portion of the glutamine side chain also reduced activity compared with the benchmark molecule III (24). As described above, the serotype 2 3C protease cocrystal structure with compound III indicates that the P1 side-chain cis-NH is exposed to solvent. Selective alkylation of the amide was viewed as a means of reducing inhibitor peptide character without compromising binding. We enforced cis-amide geometry by incorporating a P1 lactam moiety into the inhibitor design. Based on modeling, we predicted that (S) stereochemistry would be required at the lactam α-carbon to position correctly the lactam’s hydrogen-bonding functionality, which is essential for recognition and binding in the S1 pocket. The resulting molecule was 10-fold more potent than compound III against type 14 3C protease and more than 5-fold better as an antiviral agent in cell culture (25).

Substitutions at P2. Replacement of the P2 benzyl side chain generally leads to reduced inhibitory properties. Smaller aliphatic side chains having fewer van der Waals contacts with the large S2 specificity pocket are particularly poor inhibitors. In the case of type 14 3C protease, additional functionality at the 4-position of the P2 phenyl ring can lead to modestly higher kobs/I values; however, the same compounds when tested against 3C from other rhinovirus serotypes were often less inhibitory than compound III. The 4-fluoroPhe analog was moderately more potent than the parent compound in assays against 3C protease from serotypes 2, 14, and 16 (24). The P2 backbone amide of compound III donates a hydrogen bond to the side-chain oxygen of invariant Ser-128. Ser-128 is located in a turn on an exposed, somewhat flexible loop forming one side of the S2 specificity pocket (Fig. 3). Various 3C protease cocrystal structures indicate that this loop can undergo small (~1.5-Å) inhibitor-specific conformational changes. We reasoned that replacement of the P2–P3 peptide bond with ketomethylene functionality would reduce the peptidic character of the resulting molecule, whereas loss of the exposed surface hydrogen bond might not impact inhibitory activity severely. The ketomethylene inhibitor showed slightly reduced 3C protease inhibition (17,400 M⁻¹·s⁻¹), compared with that of compound III, but had improved antiviral properties (26).

Substitutions at P3. The leucine side chain of compound III is solvent exposed. As expected, a wide variety of functionality is tolerated at this position with minor effects on enzyme inhibitory activity (24).

Substitutions at P4. Attempts to optimize the N-terminal (P4) functionality focused initially on modifications to the benzyl portion of the CBZ group to enhance binding in the hydrophobic S4 specificity pocket. We were also interested in exploring replacements for the carbamate oxygen atom adjacent to the benzyl group. The cocrystal structure of compound III with serotype 2 3C protease (Fig. 3) reveals that this inhibitor oxygen atom is positioned partially inside the S4 pocket with a gap between it and the side chain of Phe-170 (24). The thiocarbamate analog of CBZ had significantly increased inhibitory activity (kobs/I = 280,000 M⁻¹·s⁻¹) and improved antiviral properties (EC50 = 0.27 µM). A 1.9-Å crystal structure of the thiocarbamate analog of compound III bound to serotype 2 3C protease indicated that the thiocarbamate sulfur atom lies 1.5 Å deeper in the S4 pocket than the corresponding oxygen of compound III and is in van der Waals contact with Phe-170 (24). Replacement of oxygen with the larger, more easily polarized, and more easily dehydrated S atom probably accounts for much of the increase in kobs/I by enhancing equilibrium binding of the inhibitor to 3C protease before covalent-bond formation.

Concerns about possible metabolic instability of P4 thiocarbamate-containing 3C protease inhibitors prompted a more systematic search for other N-terminal amides with improved activity compared with compound III. Tripeptidyl ethyl propenoate Michael acceptors of sequence Leu-Phe-Gln were assembled on solid supports. The N-terminal amine was coupled to a variety of carboxylic acids and acid chlorides to yield approximately 500 N-terminal protected tripeptide Michael acceptors. These compounds were screened subsequently against type 14 3C protease by using high-throughput assay techniques (27). Accordingly, the N-terminal 5-methylisoxazole-3-carboxamide analog was identified as a potent 3C protease inhibitor (kobs/I = 260,000 M⁻¹·s⁻¹) with improved antiviral activity (EC50 = 0.25 µM) compared with that of compound III.
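The single-site modifications discussed so far are easier to compare when each is expressed as a fold change relative to compound III. The short sketch below does only that arithmetic, using the kobs/I and EC50 values quoted in the text. The dictionary layout and the function name `fold_changes` are hypothetical, the ketomethylene EC50 is left as `None` because no value is given, and the figures are assumed to be directly comparable across the cited assays, which the text does not guarantee.

```python
# Benchmark values for compound III as quoted in the text.
REFERENCE = {"kobs_over_i": 25_000, "ec50_um": 0.54}

# Analogs of compound III; kobs/I in M^-1 s^-1, EC50 in micromolar.
ANALOGS = {
    "benzyl ester": {"kobs_over_i": 39_400, "ec50_um": 3.2},
    "P2-P3 ketomethylene": {"kobs_over_i": 17_400, "ec50_um": None},
    "P4 thiocarbamate": {"kobs_over_i": 280_000, "ec50_um": 0.27},
    "P4 5-methylisoxazole-3-carboxamide": {"kobs_over_i": 260_000, "ec50_um": 0.25},
}


def fold_changes(analog, reference=REFERENCE):
    """Return (enzyme-inactivation fold, antiviral-potency fold) vs. the reference.

    Values above 1 mean the analog is better than the reference on that measure;
    the antiviral entry is None when no EC50 is available.
    """
    rate_fold = analog["kobs_over_i"] / reference["kobs_over_i"]
    ec50 = analog["ec50_um"]
    potency_fold = None if ec50 is None else reference["ec50_um"] / ec50
    return rate_fold, potency_fold


summary = {name: fold_changes(values) for name, values in ANALOGS.items()}

for name, (rate_fold, potency_fold) in summary.items():
    potency = "n/a" if potency_fold is None else f"{potency_fold:.2f}x"
    print(f"{name}: kobs/I {rate_fold:.2f}x, antiviral potency {potency}")
```

The benzyl ester illustrates why both columns matter: it inactivates the enzyme faster than compound III yet is weaker in the cell-based assay.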

AG7088, a 3C Protease Inhibitor with Potent Antiviral Activity Against Multiple Human Rhinovirus Serotypes

For each position in the N-terminal protected tripeptide portion of compound III, modifications were identified that imparted increased activity against 3C protease and better antiviral properties compared with those of the parent molecule. We anticipated that by combining several of these individually beneficial modifications into a single molecule, further improvements in enzyme inhibition and antiviral activity could be achieved. Below, the inhibitory, antiviral, and enzyme-specificity properties of one such compound, AG7088, are described further.

Activity Against Rhinovirus 3C Protease. The covalent structure of AG7088 is shown in Fig. 1. The compound has excellent activity against serotype 14 3C protease (kobs/I = 1,470,000 M⁻¹·s⁻¹) and is a potent antiviral agent with low toxicity in the HeLa cell assay (EC50 = 0.013 µM; 50% toxic concentration > 100 µM; ref. 28). AG7088 is highly specific for picornavirus 3C proteases, having negligible inhibitory activity against a panel of mammalian cysteine and serine proteases, including cathepsin B, elastase, chymotrypsin, trypsin, thrombin, and calpain (25). Direct inhibition of rhinovirus 3C proteolytic activity in virally infected H1-HeLa cells treated with AG7088 can be inferred from dose-dependent accumulations of viral precursor proteins shown by SDS/PAGE analysis of radiolabeled polyproteins (28). A crystal structure of AG7088 bound to serotype 2 3C protease was determined at 1.85-Å resolution (Fig. 4).
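A second-order inactivation constant is easier to appreciate as a half-time for loss of enzyme activity at a given inhibitor concentration. The sketch below assumes pseudo-first-order conditions: inhibitor in large excess over enzyme and at a concentration well below saturation, so that kobs is simply (kobs/I) × [I]. The 1 µM concentration is a hypothetical value chosen for illustration, not an assay condition from the text, and the function name `inactivation_half_time` is likewise hypothetical.

```python
import math


def inactivation_half_time(kobs_over_i, inhibitor_molar):
    """Half-time (s) for enzyme inactivation under pseudo-first-order conditions.

    kobs_over_i: second-order inactivation constant in M^-1 s^-1.
    inhibitor_molar: inhibitor concentration in mol/L (assumed well below
    saturation, so kobs scales linearly with concentration).
    """
    kobs = kobs_over_i * inhibitor_molar  # pseudo-first-order rate constant, s^-1
    return math.log(2) / kobs


HYPOTHETICAL_CONCENTRATION = 1e-6  # 1 uM, illustrative only

t_half_ag7088 = inactivation_half_time(1_470_000, HYPOTHETICAL_CONCENTRATION)
t_half_compound_iii = inactivation_half_time(25_000, HYPOTHETICAL_CONCENTRATION)

print(f"AG7088 half-time: {t_half_ag7088:.2f} s")
print(f"compound III half-time: {t_half_compound_iii:.1f} s")
print(f"ratio: {t_half_compound_iii / t_half_ag7088:.1f}-fold")
```

At any fixed concentration in this linear regime the ratio of half-times equals the ratio of the two kobs/I values, so the roughly 59-fold difference does not depend on the concentration chosen.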

The overall binding mode of AG7088 to 3C protease is generally similar to that described for compound III; however, the structurally distinct N-terminal protecting groups are oriented differently in the protein’s S4 binding subsite. As anticipated, the five-member lactam ring at P1 makes three hydrogen bonds with the protease similar to those for compound III. However, as a result of constraints imposed on the internal geometry of the lactam ring, the hydrogen bond between the lactam amide NH and the backbone carbonyl of Thr-142 is longer (3.2 Å) and the geometry less favorable than in the case of compound III, in which optimal positioning of the P1 carboxamide by rotation about the Cδ–Cγ bond is less hindered. Why then does replacement of the P1 Gln in compound III with a five-member lactam ring increase kobs/I against type 14 3C protease by almost a factor of 10?

The relatively rigid lactam side chain at P1 stands to lose less conformational entropy on binding in the S1 pocket than the more flexible Gln and therefore probably binds tighter to 3C protease than its acyclic counterpart. Another favorable effect of binding on entropy may result from the manner in which the lactam affects the conformation of unbound AG7088 in solution. We have determined the small-molecule crystal structure of AG7088 (T. L. Hendrixson, unpublished results) and find that its conformation is very similar to that observed for AG7088 in complex with 3C protease. In both cases, the two lactam (CH2) groups pack against the side chain of the P3 valine, which may help stabilize the active conformer in solution, thus reducing entropy loss on inhibitor binding. The two lactam (CH2) groups also create additional van der Waals contacts with backbone atoms of residues 143 and 144, which, compared with a P1 Gln, may further reduce the flexibility and conformational heterogeneity that is observed for this region in the absence of bound inhibitors. Particularly noteworthy is that, for AG7088 bound to 3C protease, the peptide bond 144–145 has its NH pointing in toward the oxyanion hole where it may play a role in hydrogen bonding to the carbonyl oxygen of the Michael acceptor in the transition state for Michael addition. We have determined cocrystal structures for five P1 cyclic lactam-containing 3C protease inhibitors, and in each case, the 144–145 peptide is in what we believe to be the active conformation. In contrast, more than 20 cocrystal structures of P1 Gln-containing irreversible 3C protease inhibitors all show this peptide bond turned around with the backbone NH group pointing out into solvent (see Fig. 3). These results suggest that the greater ability of a P1 lactam to stabilize the catalytically active conformation of residues N-terminal to the nucleophilic Cys-147 may accelerate the chemical step and thus contribute to the increase in kobs/I compared with P1 Gln-containing analogs.

FIG. 4. AG7088 bound to serotype 2 rhinovirus 3C protease. The protein is rendered as a semitransparent solvent-accessible surface color coded at each residue according to amino acid conservation among the 10 serotypically distinct rhinovirus 3C proteases of known primary structure. Residues indicated in dark blue are identically conserved among the 10 known sequences. Increasing amino acid variation at a particular residue is indicated by progressively warmer coloring, with purple signifying two differences and red signifying seven differences among the 10 known 3C protease sequences. Other color coding is the same as in Fig. 2, except that the fluorine atom of AG7088 is purple.

In compound III, the P2 backbone amide donates a hydrogen bond to the side-chain hydroxyl of Ser-128. As a consequence of replacing this group with a methylene moiety in AG7088, surface-exposed Ser-128 moves 0.7 Å where it can interact preferentially with bulk solvent. The isoxazole group of AG7088 is more buried in the S4 pocket than the CBZ of compound III and is oriented orthogonal to the CBZ benzene ring. The isoxazole oxygen is positioned close to the side chain of Phe-170, which moves on average about 0.6 Å compared with its position in the complex with compound III. Deeper penetration of this group into S4 also causes positional changes (~0.8 Å) centered around the backbone and side-chain atoms of Asn-165 with somewhat smaller displacements for Gly-166 as well. One consequence of these induced protein movements is that the shape of the S1 pocket changes slightly, particularly in the region proximate to the P1 side-chain amide and its attached methylene, suggesting that alterations in the N-terminal blocking group can affect binding of the P1 substituent.

Antiviral Activity of AG7088 Against Rhinovirus Serotypes. In H1-HeLa or MRC-5 cell protection assays, AG7088 inhibited replication of all 48 rhinovirus serotypes tested to date (28), including representative virus strains derived from minor and major receptor groups (29). The mean EC50 and EC90 values are 0.023 µM (range: 0.003–0.081 µM) and 0.082 µM (range: 0.018–0.261 µM), respectively (28). Pirodavir and pleconaril are antipicornaviral agents that bind to viral capsids, preventing receptor attachment and/or viral uncoating. Pirodavir inhibited the replication of 42 of 47 rhinovirus serotypes tested with a mean EC50 value of 0.32 µM (range: 0.003–4.770 µM), whereas pleconaril inhibited replication of 42 of 45 serotypes tested with a mean EC50 value of 0.822 µM (range: 0.003–8.112 µM) (28). The 50% cytotoxic concentration of AG7088 is >1,000 µM compared with 150 µM and 77 µM for pirodavir and pleconaril, respectively (28). These studies establish AG7088 as a highly potent, nontoxic antirhinoviral agent with broad efficacy against multiple virus serotypes. The compound has been formulated for intranasal delivery and has recently entered clinical trials.

Experimental Crystal Structure of AG7088 Bound to Serotype 2 Rhinovirus 3C Protease. Serotype 2 human rhinovirus 3C protease was incubated with a 3-fold molar excess of AG7088 in the presence of 2% (vol/vol) DMSO for 24 h at 4°C. The complex was concentrated to 6.8 mg/ml and then passed through a 0.22-µm cellulose-acetate filter. Crystals were grown at 13°C by using a hanging-drop vapor-diffusion method in which equal volumes (3 µl) of the protein-ligand complex and reservoir solution were mixed on plastic coverslips and sealed over individual wells filled with 1 ml of reservoir solution containing 20% (vol/vol) polyethylene glycol (molecular weight 10,000) and 0.1 M Hepes (pH 7.5).

A single crystal measuring 0.3 × 0.1 × 0.1 mm (space group P2₁2₁2₁; a = 34.32, b = 65.68, c = 77.89 Å) was prepared for low-temperature data collection by transfer to an artificial mother liquor solution consisting of 400 µl of the reservoir solution mixed with 125 µl of glycerol and then flash frozen in a stream of N2 gas at –170°C. X-ray diffraction data were collected with a MAR Research 345-mm imaging plate and processed with DENZO. Diffraction data were 89.2% complete to a resolution of 1.85 Å with R(sym) = 1.9%. Protein atomic coordinates from the cocrystal structure of type 2 3C protease with compound I (15) were used to initiate rigid-body refinement in X-PLOR followed by simulated annealing and conjugate gradient minimization protocols. Placement of the inhibitor, addition of ordered solvent, and further refinement proceeded as described in ref. 15. The final R factor was 21.8% [12,184 reflections with F > 2σ(F)]. The root-mean-square deviations from ideal bond lengths and angles were 0.016 Å and 2.9°, respectively. The final model consisted of all atoms for residues 1–180 (excluding the side chains of residues 12, 21, 45, and 65) plus 221 water molecules.

1. Kräusslich, H.G. & Wimmer, E. (1988) Annu. Rev. Biochem. 57, 701–754.
2. Kay, J. & Dunn, B.M. (1990) Biochim. Biophys. Acta 1048, 1–8.
3. Lawson, M.A. & Semler, B.L. (1990) Curr. Top. Microbiol. Immunol. 161, 49–87.
4. Roehl, H.H., Parsley, T.B., Ho, T.V. & Semler, B.L. (1997) J. Virol. 71, 578–585.
5. Leong, L.E.C., Walker, P.A. & Porter, A.G. (1993) J. Biol. Chem. 268, 25735–25739.
6. Andino, R., Rieckhof, G.E., Achacoso, P.L. & Baltimore, D. (1993) EMBO J. 12, 3587–3598.
7. Xiang, W., Harris, K.S., Alexander, L. & Wimmer, E. (1995) J. Virol. 69, 3658–3667.
8. Sperber, S.J. & Hayden, F.G. (1988) Antimicrob. Agents Chemother. 32, 409–419.
9. Matthews, D.A., Smith, W.W., Ferre, R.A., Condon, B., Budahazi, G., Sisson, W., Villafranca, J.E., Janson, C.A., McElroy, H.E., Gribskov, C.L., et al. (1994) Cell 77, 761–771.
10. Allaire, M., Chernaia, M.M., Malcolm, B.A. & James, M.N.G. (1994) Nature (London) 369, 72–76.
11. Mosimann, S.C., Cherney, M.M., Sia, S., Plotch, S. & James, M.N.G. (1997) J. Mol. Biol. 273, 1032–1047.
12. Long, L.A., Orr, D.C., Cameron, J.M., Dunn, B.M. & Kay, J. (1989) FEBS Lett. 258, 75–78.
13. Malcolm, B.A., Lowe, C., Shechosky, S., Mckay, R.T., Yang, C.C., Shah, V.J., Simon, R.J., Vederas, J.C. & Santi, D.V. (1995) Biochemistry 34, 8172–8179.
14. Shepherd, T.A., Cox, G.A., McKinney, E., Tang, J., Wakulchik, M., Zimmerman, R.E. & Villarreal, E.C. (1996) Bioorg. Med. Chem. Lett. 6, 2893–2896.
15. Webber, S.E., Okano, K., Little, T.L., Reich, S.H., Xin, Y., Fuhrman, S.A., Matthews, D.A., Love, R.A., Hendrickson, T.F., Patick, A.K., III, et al. (1998) J. Med. Chem. 41, 2786–2805.
16. Kaldor, S.W., Hammond, M., Dressman, B.A., Labus, J.M., Chadwell, F.W., Kline, A.D. & Heinz, B.A. (1995) Bioorg. Med. Chem. Lett. 5, 2021–2026.
17. Burley, S.K. & Petsko, G.A. (1988) Adv. Protein Chem. 39, 125–153.
18. Webber, S.E., Tikhe, J., Worland, S.T., Fuhrman, S.A., Hendrickson, T.F., Matthews, D.A., Love, R.A., Patick, A.K., Meador, J.W., Ferre, R.A., et al. (1996) J. Med. Chem. 39, 5072–5082.
19. Sanderson, P.E.J. & Naylor-Olsen, A.M. (1998) Curr. Med. Chem. 5, 289–304.
20. Hanzlik, R.P. & Thompson, S.A. (1984) J. Med. Chem. 27, 711–712.
21. Liu, S. & Hanzlik, R.P. (1992) J. Med. Chem. 35, 1067–1075.
22. Dragovich, P.S., Webber, S.E., Babine, R.E., Fuhrman, S.A., Patick, A.K., Matthews, D.A., Lee, C.A., Reich, S.H., Prins, T.J. & Marakovits, J.T. (1998) J. Med. Chem. 41, 2806–2818.
23. Meara, J.P. & Rich, D.H. (1995) Bioorg. Med. Chem. Lett. 5, 2277–2282.
24. Dragovich, P.S., Webber, S.E., Babine, R.E., Fuhrman, S.A., Patick, A.K., Matthews, D.A., Reich, S.H., Marakovits, J.T., Prins, T.J. & Zhou, R. (1998) J. Med. Chem. 41, 2819–2834.
25. Dragovich, P.S., Webber, S.E., Babine, R.E., Fuhrman, S.A., Patick, A.K., Matthews, D.A., Reich, S.H., Marakovits, J.T., Prins, T.J. & Zhou, R. (1999) J. Med. Chem. 42, 1213–1224.
26. Dragovich, P.S., Prins, T.J., Zhou, R., Fuhrman, S.A., Patick, A.K., Matthews, D.A., Ford, C.E., Meador, J.W., Ferre, R.A. & Worland, S.T. (1999) J. Med. Chem. 42, 1203–1212.
27. Dragovich, P.S., Zhou, R., Skalitzky, D.J., Fuhrman, S.A., Patick, A.K., Ford, C.E., Meador, J.W. & Worland, S.T. (1999) Bioorg. Med. Chem. Lett. 7, 589–598.
28. Patick, A.K., Binford, S.L., Brothers, M.A., Jackson, R.L., Ford, C.E., Diem, M.D., Maldonado, F., Dragovich, P.S., Zhou, R., Prins, T.J., et al. (1999) Antimicrob. Agents Chemother., in press.
29. Uncapher, C.R., DeWitt, C.M. & Colonno, R.J. (1991) Virology 180, 814–817.

Kinetic stability as a mechanism for protease longevity

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

ERIN L. CUNNINGHAM, SHEILA S. JASWAL, JULIE L. SOHL*, AND DAVID A. AGARD†

Graduate Group in Biophysics, Howard Hughes Medical Institute, and Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94143–0448

ABSTRACT The folding of the extracellular serine protease, α-lytic protease (αLP; EC 3.4.21.12) reveals a novel mechanism for stability that appears to lead to a longer functional lifetime for the protease. For αLP, stability is based not on thermodynamics, but on kinetics. Whereas this has required the coevolution of a pro region to facilitate folding, the result has been the optimization of native-state properties independent of their consequences on thermodynamic stability. Structural and mutational data lead to a model for catalysis of folding in which the pro region binds to a conserved β-hairpin in the αLP C-terminal domain, stabilizing the folding transition state and the native state. The pro region is then proteolytically degraded, leaving the active αLP trapped in a metastable conformation. This metastability appears to be a consequence of pressure to evolve properties of the native state, including a large, highly cooperative barrier to unfolding, and extreme rigidity, that reduce susceptibility to proteolytic degradation. In a test of survival under highly proteolytic conditions, homologous mammalian proteases that have not evolved kinetic stability are much more rapidly degraded than αLP. Kinetic stability as a means to longevity is likely to be a mechanism conserved among the majority of extracellular bacterial pro-proteases and may emerge as a general strategy for intracellular eukaryotic proteases subject to harsh conditions as well.

Virtually all extracellular bacterial proteases are synthesized as precursor molecules with pro regions. In every case where the function of the pro region has been investigated, it has been found to be necessary for folding and secretion (1). One of the most striking and best studied examples of pro-mediated folding is the bacterial enzyme, α-lytic protease (αLP). αLP (EC 3.4.21.12) is a 198-aa serine protease secreted by the Gram-negative soil bacterium Lysobacter enzymogenes to degrade other soil microorganisms. The overall three-dimensional fold of αLP clearly places it in the same family as the mammalian digestive serine proteases chymotrypsin, trypsin, and elastase, despite only moderate sequence homology (2). In contrast to these mammalian homologues, whose small N-terminal zymogen peptides simply prevent premature activation, αLP is synthesized with a large 166-aa N-terminal pro region (Pro) that is required for proper folding of its mature protease domain (3). In vivo, coexpression of αLP and Pro, either in cis as the natural precursor molecule or in trans as two separate polypeptide chains, results in the secretion of active αLP (4), whereas expression of αLP alone leads to accumulation of the protease in the outer membrane because of apparent misfolding.

In vitro energetic studies reveal a novel means of stability for the mature protease arising from kinetics. This not only distinguishes αLP from its mammalian homologues but provides compelling support for the possibility of metastable native conformations in general. Emerging structural and energetic details of pro-mediated folding may define a theme for the folding of a wide range of homologous extracellular proteases that also contain pro regions. In addition, features of αLP’s kinetic barrier may provide insight into other proteins with metastable conformations of biological importance. Here we describe the role of kinetic stability in αLP folding, details of pro-αLP interactions, and a possible mechanism and an evolutionary rationale for pro-mediated folding of αLP.

Folding Under Kinetic Control. To understand the requirement of the pro region for folding, the folding free-energy landscape of αLP has been mapped in the absence of the pro region through refolding and unfolding experiments. In vitro refolding of chemically denatured αLP in the absence of Pro, by dilution from denaturant, results in an inactive molten globule-like intermediate (5). Designated the “I” state of the protease, this intermediate is monomeric and greatly expanded relative to the native state “N.” Spectroscopic data indicate that I possesses substantial secondary structure but lacks stable tertiary interactions. Urea-denaturation experiments show that I has <1 kcal/mol (1 cal = 4.18 J) stability over the unfolded state “U” (6). Under physiological conditions, the αLP I state remains stable for months without any appreciable conversion to mature enzyme. The very small fraction of I that does mature to the active N state can be measured by using a very sensitive enzymatic assay. From this assay, it has been determined that the I state refolds with an extremely slow initial rate of 1.18 × 10⁻¹¹ s⁻¹ at 4°C (t1/2 = 1,800 years), corresponding to a folding barrier height of 30 kcal/mol, as calculated by transition state theory (Fig. 1; ref. 6).
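The half-life and barrier height quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the standard Eyring form of transition state theory with a transmission coefficient of 1, which the text does not spell out, and the function names are hypothetical.

```python
import math

# Physical constants
K_B = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J*s
R_KCAL = 1.987204e-3     # gas constant, kcal/(mol*K)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def half_life_years(rate_per_s):
    """Half-life of a first-order process, converted to years."""
    return math.log(2) / rate_per_s / SECONDS_PER_YEAR


def eyring_barrier_kcal(rate_per_s, temp_k):
    """Activation free energy from k = (kB*T/h) * exp(-dG/(R*T)).

    Assumes a transmission coefficient of 1 (an assumption of this sketch).
    """
    prefactor = K_B * temp_k / H
    return -R_KCAL * temp_k * math.log(rate_per_s / prefactor)


T_4C = 277.15        # 4 degrees C in kelvin
K_FOLD = 1.18e-11    # I -> N folding rate quoted in the text, s^-1

print(f"half-life: {half_life_years(K_FOLD):.0f} years")
print(f"barrier:   {eyring_barrier_kcal(K_FOLD, T_4C):.1f} kcal/mol")
```

Under these assumptions the result lands close to the reported figures of roughly 1,800 years and 30 kcal/mol.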

This remarkably high barrier prevents the intermediate and native states from being in equilibrium with each other. Therefore, to determine the relative stability of these conformations, it is necessary to compare the ratio of the folding and unfolding rates instead of using the usual equilibrium approaches. To measure the unfolding rate on a reasonable time scale, the N state must be chemically or thermally denatured. To avoid complications of autolysis when studying the rates of αLP unfolding, the active-site serine has been mutated to alanine (S195 → A; chymotrypsin homology numbering from ref. 7). During unfolding, both secondary and tertiary structure (monitored via CD and tryptophan fluorescence, respectively) are lost simultaneously in a single rate-limiting step (6). Extrapolating the data to zero denaturant gives an unfolding rate of 1.8 × 10⁻⁸ s⁻¹ at 4°C or an unfolding barrier of 26 kcal/mol (t1/2 = 1.2 years). The ratio of this unfolding rate to the slower folding rate results in an equilibrium free energy that favors the I state by 4 kcal/mol. Over a broad range of temperatures, the I state of αLP, not the N state, is at the minimum free energy. In fact, because of the marginal stability of I, the N state is actually significantly less thermodynamically stable than either the I or U states.

*Present address: Department of Molecular and Cellular Biology, University of California, Berkeley, CA 94720.
†To whom reprint requests should be addressed. E-mail: [email protected].
PNAS is available online at www.pnas.org.
Abbreviation: αLP, α-lytic protease.

FIG. 1. Free-energy diagram of αLP folding with and without its pro region at 4°C. In the absence of its pro region (P), unfolded αLP (U) spontaneously folds to a molten globule-like intermediate (I), which proceeds at an extremely slow rate to N through a high-energy folding TS. The addition of pro region provides a catalyzed folding pathway (denoted by dashed lines) that lowers the high folding barrier and results in a thermodynamically stable inhibition complex N·P. * indicates measurement at 25°C. (Modified from ref. 6.)
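Because I and N never equilibrate on a laboratory time scale, their free-energy difference in the diagram comes from the ratio of the two measured rates rather than from an equilibrium titration. The following is a minimal sketch of that calculation using the rates quoted in the text; the function name is hypothetical and a simple two-state I ⇌ N scheme is assumed.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)
T_4C = 277.15          # 4 degrees C in kelvin

K_FOLD = 1.18e-11      # I -> N, s^-1 (from the text)
K_UNFOLD = 1.8e-8      # N -> I, s^-1 (from the text, zero-denaturant extrapolation)


def delta_g_kcal(k_forward, k_reverse, temp_k):
    """Free energy of the forward reaction: dG = -RT ln(k_forward / k_reverse)."""
    return -R_KCAL * temp_k * math.log(k_forward / k_reverse)


dg_i_to_n = delta_g_kcal(K_FOLD, K_UNFOLD, T_4C)
# A positive value means N lies above I in free energy.
print(f"dG(I -> N) = {dg_i_to_n:+.1f} kcal/mol")
```

A positive value of about 4 kcal/mol corresponds to the statement that the I state is favored over N.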

To surmount the high barrier to folding and the extraordinary thermodynamic instability of the native state, αLP has coevolved the pro region, which can assist the folding of αLP when supplied in cis or in trans. Addition of Pro to I results in rapid folding to the N state (0.037 s⁻¹) and recovery of functional protease (Fig. 1; ref. 6). Pro acts as a foldase, facilitating αLP folding by binding tightly to the folding transition state of the protease, lowering the barrier by 18.2 kcal/mol. In this manner, Pro serves as a potent catalyst, increasing the rate of αLP folding by a factor of 3 × 10⁹. In addition, Pro is the tightest binding inhibitor known for the native protease (Ki = 3 × 10⁻¹⁰ M; refs. 8 and 9), making Pro a single-turnover catalyst. This tight binding serves a critical function in αLP folding by shifting the thermodynamic equilibrium in favor of folded αLP (Pro·N is 3.4 kcal/mol more stable than Pro·I; Fig. 1).
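As a rough check on the numbers in this paragraph, the rate enhancement is simply the ratio of the catalyzed and uncatalyzed folding rates, and a dissociation-type constant such as Ki can be converted to a standard binding free energy. The sketch below is illustrative: the 25°C temperature and 1 M reference state are assumptions, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3       # gas constant, kcal/(mol*K)

K_UNCATALYZED = 1.18e-11   # s^-1, I -> N without Pro (from the text)
K_CATALYZED = 0.037        # s^-1, I -> N with Pro (from the text)
K_I = 3e-10                # M, Pro inhibition constant (from the text)
T_ASSUMED = 298.15         # K; assay temperature is an assumption of this sketch


def rate_enhancement(k_catalyzed, k_uncatalyzed):
    """Fold acceleration of folding provided by the catalyst."""
    return k_catalyzed / k_uncatalyzed


def binding_free_energy_kcal(k_d_molar, temp_k):
    """Standard binding free energy (1 M reference): dG = RT ln(Kd)."""
    return R_KCAL * temp_k * math.log(k_d_molar)


print(f"rate enhancement: {rate_enhancement(K_CATALYZED, K_UNCATALYZED):.1e}")
print(f"dG(bind) from Ki: {binding_free_energy_kcal(K_I, T_ASSUMED):.1f} kcal/mol")
```

The binding energy estimated this way need not coincide with stabilities reported from other measurements, which may have been made under different conditions.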

The product of the folding reaction is not active αLP but the inhibitory complex. Release of active αLP requires that the Pro region be removed by proteolysis. Once Pro is degraded, the active protease becomes kinetically trapped in the metastable N state, with the high barrier preventing unfolding to the more thermodynamically favored unfolded states. In this way, pro-mediated folding provides the only efficient means of folding αLP to its metastable native conformation.

FIG. 2. (a) Topology of Pro as described in the text. A disordered loop in the Pro C domain is shown in red. (b) Schematic of primary sequence alignments of pro regions from nine bacterial serine proteases. Alignments were determined by using the αLP Pro structure as a guide. Regions of sequence homology correspond to specific secondary structures in the Pro structure, with the Pro C-terminal domain being the most conserved region. N-terminal sequences lacking homology are depicted by thin black lines. αLP, Lysobacter enzymogenes αLP (17); SGPC, Streptomyces griseus protease C (18); RPI, Rarobacter faecitabitus protease I (19); SGPD, S. griseus protease D (20); SGPE, S. griseus protease E (21); TFPA, Thermomonospora fusca serine protease (22); SAL, Streptomyces lividans protease (23); SGPA, S. griseus protease A (24); SGPB, S. griseus protease B (24).

Structures of Pro and Pro·αLP Complex. Recently determined crystal structures of Pro and the Pro·N complex illuminate Pro·αLP interactions (10). Alone, Pro adopts a novel C-shaped α/β-fold, consisting of an N-terminal helix, two compact globular domains (N domain, C domain) connected by a nearly rigid hinge region, and a C-terminal tail (Fig. 2a). Each globular domain contributes a three-stranded β-sheet to the concave surface of the molecule and at least one α-helix that packs against these β-sheets to form the convex surface. The N-terminal helix appears highly flexible, changing orientations in different crystal environments. Two of the three Pro molecules in the crystallographic asymmetric unit show different conformations for the N-terminal helix, whereas the third molecule reveals the helix to be disordered. Similarly, the C-terminal tail is unseen in the Pro structure and apparently disordered in unbound Pro.

Sequence comparisons with homologous pro-proteases suggest that the Pro structure may be a common pro region fold. Primary sequence alignments of Pro and eight related pro regions (Fig. 2b) indicate that these homologous pro regions share common secondary-structure elements, the most conserved region being that of the Pro C-terminal domain, despite a wide range of pro region sizes. These pro regions appear compatible with the Pro structure and presumably exhibit similar mechanisms of foldase activity.

In the case of αLP, Pro·N complex formation does not significantly alter the Pro structure (Fig. 3a). This is surprising because the pro region by itself has quite limited stability (Tm = 28.5°C, 2.3 kcal/mol; ref. 11), whereas the Pro·N complex is greatly stabilized (13.6 kcal/mol; ref. 6). However, the only notable differences are seen in the structuring of the C-terminal tail and the positioning of the flexible N-terminal helix on protease binding. As expected for a tight-binding inhibitory complex, the Pro·N complex structure buries a very large surface (>4,000 Å²) in its intermodular interface.

The most striking feature of the complex structure is the fact that Pro binds almost exclusively to the αLP C domain, effectively surrounding the αLP C-terminal β-barrel. This observation raises the distinct possibility that it is the αLP C domain that cannot fold properly and is therefore the focused substrate of Pro foldase activity. In support of this, recent mutagenesis studies indicate that the structuring of the protease C domain is an integral part of the high folding barrier. Screens of libraries of chemically mutagenized αLP reveal that mutations that lower the folding transition state (as much as 3 kcal/mol) all map to the C domain of the protease (A. Derman and D.A.A., unpublished data). The most extensive and complementary interactions in the Pro·N interface occur between the protease and the Pro C-terminal domain. In particular, the three-stranded β-sheet in the Pro C domain pairs with an extended β-hairpin in the αLP C domain (αLP residues 166–179; chymotrypsin numbering) to form a continuous five-stranded β-sheet. Additional interactions come from the insertion of the Pro C-terminal tail into the protease active site. The Pro C tail binds in a substrate-like manner to directly occlude the protease active site, as predicted by biochemical data (25). Placement of the C tail also provides a binding pocket for the tip of the β-hairpin.

αLP Folding Barrier and Pro Foldase Mechanism. The integration of prominent features of the complex structure with mutagenesis studies on both Pro and αLP provides significant insights into the origin of the folding barrier and the mechanism of Pro-catalyzed folding. Because Pro acts as a folding catalyst, it is possible to use modified Michaelis-Menten kinetics to extract functional information about the folding reaction (8). This analysis provides information on the formation of the Pro·I Michaelis complex (Km) and the stabilization of the folding transition state (kcat). In addition, the stability of the Pro·N complex can be assessed by measuring the inhibition of peptide substrate hydrolysis by Pro (Ki).
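The exact modification of Michaelis-Menten kinetics used in ref. 8 is not given here. As a hedged illustration of how Km and kcat enter such an analysis, the sketch below assumes the simplest case: Pro in large excess over I and rapid-equilibrium binding, so that the observed folding rate constant depends hyperbolically on Pro concentration. The function name and the Km value are hypothetical, and treating 0.037 s⁻¹ as kcat is an assumption.

```python
def observed_folding_rate(pro_conc_molar, k_cat, k_m):
    """Pseudo-first-order folding rate constant (s^-1) with Pro in excess over I.

    k_obs = k_cat * [Pro] / (K_m + [Pro])
    """
    if pro_conc_molar < 0 or k_m <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return k_cat * pro_conc_molar / (k_m + pro_conc_molar)


K_CAT = 0.037            # s^-1; Pro-catalyzed folding rate from the text, used as kcat
K_M_HYPOTHETICAL = 1e-5  # M; placeholder value chosen only for illustration

for conc in (1e-6, 1e-5, 1e-4):
    rate = observed_folding_rate(conc, K_CAT, K_M_HYPOTHETICAL)
    print(f"[Pro] = {conc:.0e} M -> k_obs = {rate:.4f} s^-1")
```

In this form a mutation that weakens initial Pro·I binding raises Km and shifts the curve to higher Pro concentrations, whereas a mutation that destabilizes the transition state lowers the plateau set by kcat.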

FIG. 3. (a) Ribbon diagram of the Pro·N complex structure. The αLP N and C domains are colored magenta and blue, respectively, with the side chains of the catalytic triad shown in red (His-57, Asp-102, and Ser-195; chymotrypsin numbering). Illustrated in green, bound Pro inserts its C-terminal tail into the protease active site. A disordered loop in the Pro C-terminal domain, indicated by an arrow, presents a likely secondary protease cleavage site, leading to the release of active αLP from the inhibitory complex. (b) Detail of the hydrated Pro·N interface. A gap between Pro (green) and the αLP C domain (blue) is filled by ordered water molecules, which are shown as red spheres. Some of these waters mediate hydrogen bonds (dashed orange lines) between the αLP β-hairpin and the Pro three-stranded β-sheet that form the shared five-stranded β-sheet of the Pro·N interface. Residues in the αLP β-hairpin that affect formation of the initial Pro·I Michaelis complex (Ile-167 and Asn-170) are displayed in yellow. Figures are modified from figures 2b and 3b of ref. 10.

Mutations within the αLP β-hairpin loop alter both Km and kcat (8). The Km effects reveal that formation of the shared β-sheet must occur in the first step of Pro-catalyzed folding, whereas the kcat effects indicate that this extended sheet continues to play a role during folding catalysis. Unlike these hairpin mutations, removing residues from the Pro C tail (8) does not affect initial binding to the αLP I state (Km) and only marginally affects αLP N state binding (Ki), despite the Pro C tail’s high complementarity to the αLP-binding pocket. In marked contrast, these same Pro C tail truncations drastically reduce the folding rate (kcat), profoundly hindering the ability of Pro to stabilize the folding transition state (TS). Deletion of the last three residues from the Pro C tail decreases kcat by ≈300-fold, and removal of an additional fourth residue decreases folding by at least a factor of 10⁷. The Pro C tail therefore plays a direct role in Pro foldase activity, preferentially stabilizing the folding TS over the I and N states. Preliminary data indicate that the Pro N domain also contributes to the catalytic activity of Pro. Mutations in Pro at the protease-Pro N domain interface affect TS stabilization (E.L.C., P. Chien, and D.A.A., unpublished data).
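A fold change in kcat can be translated into a loss of transition-state stabilization through ΔΔG‡ = RT ln(fold change). The sketch below applies this to the two truncations described above; the temperature is an assumption (the assay temperature is not stated in this passage), the function name is hypothetical, and the 10⁷ figure is a lower bound in the text, so the corresponding energy is a lower bound as well.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)
T_ASSUMED = 298.15     # K; assumed for illustration only


def ts_destabilization_kcal(fold_decrease_in_kcat, temp_k):
    """Loss of transition-state stabilization: ddG = RT ln(fold decrease)."""
    if fold_decrease_in_kcat <= 0:
        raise ValueError("fold decrease must be positive")
    return R_KCAL * temp_k * math.log(fold_decrease_in_kcat)


truncations = (
    ("last three residues removed", 300.0),
    ("four residues removed (lower bound)", 1e7),
)
for label, fold in truncations:
    ddg = ts_destabilization_kcal(fold, T_ASSUMED)
    print(f"{label}: {ddg:.1f} kcal/mol")
```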

FIG. 4. Proposed model of Pro-catalyzed folding of αLP. (a) The pro domain of the Pro-αLP precursor folds, while the protease N and C domains remain separated and expanded. (b) The three-stranded β-sheet of the Pro C domain pairs with the solvent-exposed β-hairpin of the αLP C domain, forming a continuous five-stranded β-sheet. (c) Substrate-like binding of the Pro-αLP junction to the nascent active site positions the β-hairpin and leads to the structuring of the αLP C domain. (d) The αLP N domain folds on docking with the αLP C domain to complete the protease active site, which can then process the Pro-αLP junction. The Pro C-terminal tail remains bound to the active site in this inhibitory complex while the new αLP N terminus repositions to its native conformation. (e) Intermolecular cleavage of secondary cleavage sites by αLP or other exogenous proteases leads to (f) the eventual degradation of Pro and release of active, mature αLP. Color scheme as in Fig. 3a. Figure is modified from figure 4 of ref. 10.

Kinetic and thermodynamic analyses suggest that the folding transition state and the native state must share many structural features. From the denaturant dependence of unfolding rates (S.S.J. and D.A.A., unpublished data), it is possible to infer that the folding transition state is significantly closer to the native state than to the folding intermediate, I. Furthermore, the extremely tight binding of the rigid Pro region to both the native state (13.6 kcal/mol) and the transition state (18.2 kcal/mol) suggests that at least the αLP C domain must be similarly structured in both states. However, they cannot be identical. Although the Pro C-terminal tail makes ideal substrate-like interactions with αLP, deletions only minimally affect the stability of Pro·N, while causing profound effects on the folding transition state (8). This suggests that the native Pro·N complex must be “strained” such that the total binding energy possible for the Pro C tail is not realized in the Pro·N complex. By contrast, the intrinsic binding energy of the Pro C tail does seem to be fully realized when complexed to the folding transition state, because it is stabilized by an additional 5 kcal/mol compared with Pro·N (Fig. 1).
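The "additional 5 kcal/mol" follows from the two binding energies quoted in the same paragraph. Written out, with subscript labels introduced here purely for illustration:

```latex
\[
\Delta\Delta G_{\mathrm{TS\,vs\,N}}
  = \Delta G_{\mathrm{bind}}(\mathrm{Pro{\cdot}TS}) - \Delta G_{\mathrm{bind}}(\mathrm{Pro{\cdot}N})
  = 18.2 - 13.6
  = 4.6\ \mathrm{kcal/mol} \approx 5\ \mathrm{kcal/mol}
\]
```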

Observations based on the Pro·N complex structure (10) suggest that this strain may be the result of poor complementarity in regions of the Pro·N interface, which could be improved to yield the observed additional stabilization in the Pro·TS complex. Most notably, there is a significant gap in the interface where the protease meets the junction of the two Pro domains. This gap contains eight ordered solvent molecules, three of which act to mediate hydrogen bonds between the αLP β-hairpin and Pro β-strand. Such highly solvated interfaces have been previously observed where two surfaces interact in two different conformational states. These “adapter” waters are seen in protein-DNA complexes (12), where waters populate the interface in nonspecific complexes yet are excluded in the specific complex. Similarly, waters are often used to adapt quaternary changes in allosteric enzymes (13, 14), with fewer waters in the higher-affinity state because of improved surface complementarity. The Pro·αLP TS may be similarly stabilized by excluding the bound waters, thereby reducing the entropic cost of ordering the waters and increasing the direct Pro·αLP interface. Because the structure of free αLP is nearly identical to that of αLP complexed with Pro, it is probable that strong αLP N state interactions prevent optimization of the Pro·αLP interface predicted in the TS complex. Destabilizing αLP mutations may disrupt these interactions enough to distort the Pro·αLP complex toward more TS-like binding.

αLP Folding Model. These structural and mutagenesis data can be synthesized into a model of Pro-catalyzed folding of αLP (Fig. 4; ref. 10).

Table 1. Glycine content of αLP relatives

Protease                            Glycines, no.   Residues, no.   Glycines, %
αLP family
  L. enzymogenes αLP                32              198             16.2
  R. faecitabitus protease I        29              174             16.7
  S. albogriseolus protease 20      32              172             18.6
  S. fradiae protease 1             31              186             16.7
  S. griseus protease A             32              182             17.6
  S. griseus protease B             32              185             17.3
  S. griseus protease C             35              190             17.9
  S. griseus protease D             32              188             17.0
  S. griseus protease E             32              183             17.5
  S. lividans protease              35              171             20.5
  S. lividans protease O            27              150             18.0
  T. fusca serine protease          35              186             18.8
Trypsin family
  Trypsin                           24              224             10.7
  Chymotrypsin B                    23              245             9.4
  Elastase                          27              270             10.0
  Acrosin                           27              436             6.2
  Achelase 1 protease               25              213             11.7
  α-Tryptase                        19              245             7.8
  Batroxobin                        29              231             8.7
  Carboxypeptidase A complex III    25              240             10.4
  Coagulation factor VII            37              406             9.1
  Collagenase                       22              230             9.6
  Complement factor B               61              739             8.3
  Enteropeptidase                   77              1035            7.4
  Glandular kallikrein 1            19              238             8.0
  Granzyme A                        22              234             9.4
  Mast cell protease 7              22              244             9.0
  Natural killer cell protease 1    19              228             8.3
  Plasminogen                       59              790             7.5
  Prostasin precursor               28              311             9.0
  Serine protease hepsin            44              417             10.6
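The percentage column is the glycine count divided by the residue count. As a small illustration of how the two families compare, the sketch below recomputes the percentage for a handful of rows copied from Table 1 and averages them per family; only a subset of rows is used, so the means are indicative rather than the full-table values, and the function names are hypothetical.

```python
def glycine_percent(glycines, residues):
    """Glycine content as a percentage of total residues."""
    if residues <= 0:
        raise ValueError("residue count must be positive")
    return 100.0 * glycines / residues


# (name, glycines, residues): a subset of rows copied from Table 1
ALP_FAMILY = [
    ("L. enzymogenes αLP", 32, 198),
    ("S. griseus protease A", 32, 182),
    ("S. griseus protease B", 32, 185),
]
TRYPSIN_FAMILY = [
    ("Trypsin", 24, 224),
    ("Chymotrypsin B", 23, 245),
    ("Elastase", 27, 270),
]


def mean_percent(rows):
    """Unweighted mean glycine percentage over the given rows."""
    return sum(glycine_percent(g, n) for _, g, n in rows) / len(rows)


for label, rows in (("αLP family", ALP_FAMILY), ("Trypsin family", TRYPSIN_FAMILY)):
    print(f"{label}: mean glycine content {mean_percent(rows):.1f}%")
```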

In this folding scheme, the N and C domains of the expanded molten globule folding intermediate are separated, and the β-hairpin is exposed to solvent. Prefolded Pro initiates protease folding by binding to the hairpin, forming a continuous five-stranded β-sheet. Efficient folding requires the Pro C tail to then bind to the nascent active site, positioning the hairpin and thereby assisting the structuring of the αLP C domain. Finally, the αLP N domain docks and folds against the C domain to complete both the catalytic triad and the packing of the N state core. Studies of the intact Pro-αLP precursor (11) support the proposed two-step folding model. Precursor refolding experiments show biphasic kinetics, with an initial fast rate equal to the rate of pro folding alone, followed by a slower rate for pro-mediated folding of αLP.

During in cis folding, formation of the active site allows the protease domain of the precursor to process the Pro-αLP junction, producing the two distinct polypeptide chains of the Pro·N complex. After cleavage, the Pro C tail remains bound to the active site while the newly formed protease N terminus repositions to its native conformation 24 Å away. Although the Pro-αLP precursor and Pro·N complex show similarities in secondary and tertiary structure, the marginal stability of the precursor (2.2 kcal/mol; ref. 11) compared with the complex (10.6 kcal/mol) suggests that the rearrangement of the N terminus is critical to αLP N state stabilization. In addition to the primary intramolecular cleavage site, αLP also recognizes intermolecular cleavage sites within Pro, eventually leading to Pro degradation and release of active, mature protease. The disordered loop within the Pro C domain (Figs. 3 and 4E) presents a likely target for the requisite secondary cleavage event. This secondary cleavage site is sensitive to many other proteases besides αLP. In fact, there may be a functional synergism between the multiple proteases secreted simultaneously by the host, Lysobacter enzymogenes, in cleaving each other’s pro regions.

FIG. 5. Advantages of kinetic stability. (a) A typical thermodynamically stable protein without a large barrier samples fully and partially unfolded states, making it susceptible to proteolysis. (b) A kinetically stable protein only rarely samples these unfolded states, making it much more resistant to proteolysis. In the case of αLP, the native state is less stable than the unfolded states; however, kinetic stability does not require a metastable native state. (c) αLP is more resistant to proteolysis than either trypsin or chymotrypsin. αLP (purified as described in ref. 25), trypsin (TPCK-treated, Worthington), and chymotrypsin (TLCK-treated, Worthington) (6.5 µM each) were mixed in 10 mM CaCl2, 50 mM Mops (pH 7.0) at 37°C. Aliquots were removed over time, and the survival of the individual proteases was measured based on their activities, which could be distinguished given their nonoverlapping specificities for different substrates (succinyl-Ala-Pro-Ala-pNA, succinyl-Ala-Ala-Pro-Arg-pNA, and succinyl-Ala-Ala-Pro-Leu-pNA, used for αLP, trypsin, and chymotrypsin, respectively, all at 1 mM in 100 mM Tris, pH 8). Whereas αLP activity decreases at a rate of less than (600 hr)⁻¹, chymotrypsin and trypsin are inactivated with rates of (4 hr)⁻¹ and (60 hr)⁻¹, respectively.

In the proposed folding scenario, Pro must bind to and correctly position the β-hairpin. The likely importance of this β-hairpin to the αLP folding barrier is reflected in its selective conservation among related proteases within the chymotrypsin superfamily. The β-hairpin, a common structural motif found in all 13 bacterial homologues synthesized with pro regions, is noticeably absent in other related bacterial, viral, and mammalian proteases that do not require pro regions for proper folding. Furthermore, the Pro β-strand that pairs with the hairpin loop is also highly conserved in homologous pro regions, suggesting that the hairpin and its interaction with the Pro C domain are important in structuring the protease. This observation is consistent with the fact that smaller related pro regions show sequence homology only to the Pro C domain, thereby maintaining the core structure necessary for binding the β-hairpin of the protease (Fig. 2b). The positioning of the hairpin and subsequent structuring of the αLP C domain may be a general mechanism for pro region-mediated folding of β-structures. In contrast, subtilisin, a pro-protease evolutionarily unrelated to αLP, seems to use a different method of pro-catalyzed folding. The subtilisin pro domain stabilizes a pair of α-helices in the protease instead of a β-hairpin (15). Although αLP and subtilisin have convergently evolved pro-dependent folding, they differ in both their mature protease structures and the method by which their respective pro regions achieve their active protease conformations.

Physical Origins of the Folding Barrier. Although the physical origins of the αLP folding barrier and the means by which Pro lowers this barrier remain to be determined, examination of the enthalpic and entropic contributions to the free-energy difference between the αLP I and N states provides some insights into the nature of the folding barrier. Despite the thermodynamic instability of the αLP N state, titration calorimetry experiments reveal that it is enthalpically favored over the I state by 18 kcal/mol (6). Thus, the thermodynamic stability of the I state must be entropic in origin. This means that either the I and U states are more entropically favored than in “normal” proteins, the αLP N state has lower entropy than normal, or both. One possible source of this excess entropy for I may be the high percentage of glycines found in the αLP sequence. Because glycine residues lack a side chain, they can avoid steric clashes encountered by other amino acids, thereby increasing the number of accessible conformations in the unfolded states. αLP contains 16% glycines compared with only 9% in the homologous but thermodynamically stable chymotrypsin. The 10 additional glycines found in αLP, as compared with chymotrypsin, are predicted to contribute an extra ~7 kcal/mol of configurational entropy (16) to the unfolded state at 4°C. Removing this additional entropy would be sufficient to alter the direction of the I and N equilibrium, placing the N state at the global free-energy minimum.
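The glycine estimate can be put on a per-residue basis with a short calculation. The sketch below assumes that the quoted figure of about 7 kcal/mol is a TΔS term evaluated at 4°C; the function and variable names are illustrative and do not come from the original analysis.

```python
# Per-residue view of the glycine entropy estimate quoted in the text.
# Assumption: the ~7 kcal/mol value is T*dS evaluated at 4 degrees C.

CAL_PER_KCAL = 1000.0


def entropy_from_t_ds(t_ds_kcal_per_mol: float, temp_celsius: float) -> float:
    """Convert a T*dS term (kcal/mol) into dS in cal/(mol*K)."""
    temp_kelvin = temp_celsius + 273.15
    return t_ds_kcal_per_mol * CAL_PER_KCAL / temp_kelvin


EXTRA_GLYCINES = 10   # additional glycines in alphaLP versus chymotrypsin
T_DS_TOTAL = 7.0      # kcal/mol, approximate value from the text
TEMP_C = 4.0

DS_TOTAL = entropy_from_t_ds(T_DS_TOTAL, TEMP_C)   # cal/(mol*K)
DS_PER_GLYCINE = DS_TOTAL / EXTRA_GLYCINES         # cal/(mol*K) per glycine
T_DS_PER_GLYCINE = T_DS_TOTAL / EXTRA_GLYCINES     # kcal/mol per glycine

if __name__ == "__main__":
    print(f"Total dS:        {DS_TOTAL:.1f} cal/(mol*K)")
    print(f"dS per glycine:  {DS_PER_GLYCINE:.2f} cal/(mol*K)")
    print(f"TdS per glycine: {T_DS_PER_GLYCINE:.2f} kcal/mol at {TEMP_C:.0f} C")
```

Under these assumptions each additional glycine accounts for roughly 0.7 kcal/mol of unfolded-state stabilization at 4°C.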

The excess unfolding entropy may also be due in part to the extremely low conformational entropy of the αLP native structure. Although native states are often dynamic, αLP adopts a remarkably rigid native structure characterized by insensitivity to proteolysis, unusually low crystallographic B factors, and hydrogen-exchange protection factors on the order of >10¹⁰ for ~40 core amides (J. Davis, J. L. Sohl, and D.A.A., unpublished data). Protection factors of this magnitude have never been observed in any other protein. Contributing to this rigidity, loops in αLP are generally shorter and therefore likely to be less flexible than those found in chymotrypsin. Many of αLP’s extra glycine residues facilitate the tight turns found in these condensed loops. With their ability to assume unusual backbone geometries, the glycines may enable tighter and more cooperative packing within the protein core. In this manner, the high glycine content can reduce the entropy of the N state while increasing the configurational entropy of the I state.

Glycine content appears to be a common feature distinguishing proteases homologous to αLP that have pro regions from those that do not (Table 1). The Streptomyces griseus proteases, along with several other pro region-containing homologues, have 16–20% glycines, whereas the mammalian digestive enzymes and other members of the trypsin serine protease family without pro regions have 6–12% glycines.
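The separation between the two families can be checked directly from the percentage column of Table 1. The following is a minimal sketch that transcribes those values and summarizes each family; the list names and the `summarize` helper are illustrative only.

```python
from statistics import mean

# Glycine percentages transcribed from Table 1, one value per protease.
ALP_FAMILY = [16.2, 16.7, 18.6, 16.7, 17.6, 17.3,
              17.9, 17.0, 17.5, 20.5, 18.0, 18.8]
TRYPSIN_FAMILY = [10.7, 9.4, 10.0, 6.2, 11.7, 7.8, 8.7, 10.4, 9.1, 9.6,
                  8.3, 7.4, 8.0, 9.4, 9.0, 8.3, 7.5, 9.0, 10.6]


def summarize(values):
    """Return count, range, and mean (rounded to 0.1) of glycine percentages."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 1),
    }


ALP_SUMMARY = summarize(ALP_FAMILY)
TRYPSIN_SUMMARY = summarize(TRYPSIN_FAMILY)

if __name__ == "__main__":
    print("alphaLP family:", ALP_SUMMARY)
    print("Trypsin family:", TRYPSIN_SUMMARY)
    # The two ranges do not overlap in the tabulated data.
    print("Ranges disjoint:", ALP_SUMMARY["min"] > TRYPSIN_SUMMARY["max"])
```

In the tabulated data the lowest value in the αLP family is higher than the highest value in the trypsin family, so the two distributions do not overlap.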

Evolution of Longevity Through Kinetic Stability. The correlation between high glycine content and the presence of a conserved β-hairpin in the protease and the coevolution of a pro region suggests that the rigid native state and large kinetic barrier found in αLP may be conserved in other extracellular bacterial proteases. These shared properties may reflect their common function as proteases that break down microorganisms in the extracellular environment, supplying nutrients for their bacterial hosts. The utility of these proteases is compromised by their tendency to degrade themselves as well as other proteins. As such, it is presumably desirable to the host to evolve proteases that can survive as long as possible under these harsh, degradatory conditions.

A typical protein stabilized thermodynamically without a large barrier preventing unfolding would constantly sample partially and fully unfolded states, leading to rapid destruction by exogenous proteases (Fig. 5a). By contrast, kinetic stability provides a mechanism to increase the cooperativity and raise the barrier to unfolding (Fig. 5b), thereby suppressing breathing motions and global unfolding. The result is a drastic reduction in susceptibility to proteolytic degradation.

Preliminary experiments indicate that this has indeed been a successful strategy for extending αLP’s lifetime when compared with its thermodynamically stabilized homologues chymotrypsin and trypsin. In a survival assay where these three proteases are mixed and allowed to attack each other, αLP retains its biological activity for much longer than its mammalian counterparts (Fig. 5c). The sensitivity of trypsin and chymotrypsin to proteolysis is likely to be a necessary aspect of their regulation in vivo. Additional experiments demonstrate that the rate of αLP autolysis is comparable to the rate of its global unfolding, indicating that transient unfolding motions leading to proteolytic degradation have been suppressed. αLP has been so successfully optimized that it is vulnerable to degradation only after it completely unfolds, which occurs on an extremely slow time scale.
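To give a feel for the inactivation rates quoted in the Fig. 5c legend, the sketch below converts them into half-lives and surviving fractions. It assumes simple first-order decay and reads a rate of "(4 hr)⁻¹" as a time constant of 4 hr; both are interpretive assumptions, and the αLP figure is only a bound because its rate is reported as less than (600 hr)⁻¹.

```python
import math

# Time constants (hr) read from the Fig. 5c legend, taking a rate written as
# (tau hr)^-1 to mean k = 1/tau. For alphaLP, 600 hr is a lower bound on tau.
TIME_CONSTANTS_HR = {
    "alphaLP": 600.0,
    "trypsin": 60.0,
    "chymotrypsin": 4.0,
}


def fraction_active(tau_hr: float, t_hr: float) -> float:
    """Fraction of activity remaining after t_hr under first-order decay."""
    return math.exp(-t_hr / tau_hr)


def half_life(tau_hr: float) -> float:
    """Half-life (hr) corresponding to a first-order time constant."""
    return tau_hr * math.log(2)


if __name__ == "__main__":
    for name, tau in TIME_CONSTANTS_HR.items():
        print(f"{name:13s} half-life {half_life(tau):7.1f} hr, "
              f"active after 24 hr: {fraction_active(tau, 24.0):.3f}")
```

On this reading, chymotrypsin would lose half of its activity in under 3 hr, trypsin in roughly 42 hr, and αLP in no less than about 400 hr.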

There is a price for kinetic stability, however. The evolution of a large barrier to unfolding and a highly rigid native state through the incorporation of glycines and other changes has, as a consequence, created an even larger barrier to folding and thermodynamically destabilized the native state of αLP. Nature’s solution has been the coevolution of a transient pro region to promote folding by both reducing the folding barrier and stabilizing the native state. Although it is expected that the general principle of longevity through kinetic stability will be shared by the majority of extracellular bacterial proteases and numerous eukaryotic proteases, the precise details of barrier height and degree of thermodynamic destabilization of the native state are likely to vary. αLP, with its large pro region and metastable native state, may be an extreme example.

We thank Dr. Nicholas Sauter for helpful discussions. S.S.J. was supported by a Howard Hughes Medical Institute Predoctoral Fellowship. D.A.A. is an Investigator of the Howard Hughes Medical Institute.

1. Baker, D., Shiau, A.K. & Agard, D.A. (1993) Curr. Opin. Cell Biol. 5, 966–970.
2. Brayer, G.D., Delbaere, L.T.J. & James, M.N.G. (1979) J. Mol. Biol. 131, 743–775.
3. Silen, J.L., Frank, D., Fujishige, A., Bone, R. & Agard, D.A. (1989) J. Bacteriol. 171, 1320–1325.
4. Silen, J.L. & Agard, D.A. (1989) Nature (London) 341, 462–464.
5. Baker, D., Sohl, J.L. & Agard, D.A. (1992) Nature (London) 356, 263–265.
6. Sohl, J.L., Jaswal, S.S. & Agard, D.A. (1998) Nature (London) 395, 817–819.
7. Fujinaga, M., Delbaere, L.T.J., Brayer, G.D. & James, M.N.G. (1985) J. Mol. Biol. 184, 479–502.
8. Peters, R.J., Shiau, A.K., Sohl, J.L., Anderson, D.E., Tang, G., Silen, J.L. & Agard, D.A. (1998) Biochemistry 37, 12058–12067.
9. Baker, D., Silen, J.L. & Agard, D.A. (1992) Proteins 12, 339–344.
10. Sauter, N.K., Mau, T., Rader, S.D. & Agard, D.A. (1998) Nat. Struct. Biol. 5, 945–950.
11. Anderson, D.E., Peters, R.J., Wilk, B. & Agard, D. (1999) Biochemistry 38, 4728–4735.
12. Gewirth, D.T. & Sigler, P.B. (1995) Nat. Struct. Biol. 2, 386–394.
13. Schirmer, T. & Evans, P.R. (1990) Nature (London) 343, 140–145.
14. Royer, W.E.J., Pardanani, A., Gibson, Q.H., Peterson, E.S. & Friedman, J.M. (1996) Proc. Natl. Acad. Sci. USA 93, 14526–14531.
15. Gallagher, T., Gilliland, G., Wang, L. & Bryan, P. (1995) Structure (London) 3, 907–914.
16. D’Aquino, J., Gomez, J., Hilser, V., Lee, K., Amzel, L. & Freire, E. (1996) Proteins 25, 143–156.
17. Silen, J.L., McGrath, C.N., Smith, K.R. & Agard, D.A. (1988) Gene 69, 237–244.
18. Sidhu, S.S., Kalmar, G.B., Willis, L.G. & Borgford, T.J. (1994) J. Biol. Chem. 269, 20167–20171.
19. Shimoi, H., Iimura, Y., Obata, T. & Tadenuma, M. (1992) J. Biol. Chem. 267, 25189–25195.
20. Sidhu, S.S., Kalmar, G.B., Willis, L.G. & Borgford, T.J. (1995) J. Biol. Chem. 270, 7594–7600.
21. Sidhu, S.S., Kalmar, G.B. & Borgford, T.J. (1993) Biochem. Cell Biol. 71, 454–461.
22. Lao, G. & Wilson, D.B. (1996) Appl. Environ. Microbiol. 62, 4256–4259.
23. Binnie, C., Liao, L., Walczyk, E. & Malek, L.T. (1996) Can. J. Microbiol. 42, 284–288.
24. Henderson, G., Krygsman, P., Liu, C.J., Davey, C.C. & Malek, L.T. (1987) J. Bacteriol. 169, 3778–3784.
25. Sohl, J.L., Shiau, A.K., Rader, S.D., Wilk, B. & Agard, D.A. (1997) Biochemistry 36, 3894–3902.

Cysteine protease inhibitors as chemotherapy: Lessons from a parasite target

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

PAUL M. SELZER*†, SABINE PINGEL‡, IVY HSIEH*, BERNHARD UGELE§, VICTOR J. CHAN*, JUAN C. ENGEL*, MATTHEW BOGYO¶, DAVID G. RUSSELL||, JUDY A. SAKANARI*, AND JAMES H. MCKERROW*,**,††

Departments of *Pathology, **Pharmaceutical Chemistry, ¶Biochemistry, and ‡Medicine, University of California, San Francisco, CA 94143; ||Washington University, St. Louis, MO 63110; and §I. Frauenklinik, Klinikum Innenstadt, Ludwig-Maximilians Universität München, 80337 Munich, Germany

ABSTRACT Papain family cysteine proteases are key factors in the pathogenesis of cancer invasion, arthritis, osteoporosis, and microbial infections. Targeting this enzyme family is therefore one strategy in the development of new chemotherapy for a number of diseases. Little is known, however, about the efficacy, selectivity, and safety of cysteine protease inhibitors in cell culture or in vivo. We now report that specific cysteine protease inhibitors kill Leishmania parasites in vitro, at concentrations that do not overtly affect mammalian host cells. Inhibition of Leishmania cysteine protease activity was accompanied by defects in the parasite’s lysosome/endosome compartment resembling those seen in lysosomal storage diseases. Colocalization of anti-protease antibodies with biotinylated surface proteins and accumulation of undigested debris and protease in the flagellar pocket of treated parasites were consistent with a pathway of protease trafficking from flagellar pocket to the lysosome/endosome compartment. The inhibitors were sufficiently absorbed and stable in vivo to ameliorate the pathology associated with a mouse model of Leishmania infection.

Leishmaniasis is a parasitic infection caused by various species of the protozoan Leishmania. Transmitted by the bite of sand flies, Leishmania infects 12 million people and is endemic in tropical regions of America, Africa, and the Indian subcontinent, as well as in the subtropics of Southeast Asia and the Mediterranean. Three hundred and fifty million people live in areas where the disease is common, and large epidemics affecting hundreds of thousands have occurred as recently as 1991 (1). The severe visceral form of leishmaniasis may also be an opportunistic disease in AIDS patients (1). The problem of leishmaniasis is compounded by the inadequacy of current chemotherapy. The first-line drugs are antimonial derivatives that were developed more than 40 years ago. They produce serious side effects, and refractory cases are a problem. Second-line drugs are even more toxic, and require long, repeated doses with close observation (1).

To address the need for new, cost-effective leads for the chemotherapy of leishmaniasis, we have applied strategies of structure-based drug design (2). An attractive target for new chemotherapy is a family of cathepsin L-like (cpL) and cathepsin B-like (cpB) cysteine proteases found in all species of Leishmania examined, and required for parasite growth or virulence (3–5). In studies with Leishmania mexicana, elimination of selected cysteine protease genes by homologous recombination showed that null mutants of the cpL gene array designated “cpb” had reduced virulence in highly susceptible BALB/c mice, and they produced no lesions at all in C57BL/6 or CBA/Ca mice (3, 4). Double null mutants of the cpL gene families “cpb” and “cpa” produced no lesions even in BALB/c mice (3). Deletion of the cpB gene “cpc” led to reduced survival of parasites in macrophages (3, 6). While structurally distinct, Leishmania cpL and cpB overlap in substrate specificity (2). Inhibitors that would effectively target both types of cysteine proteases in Leishmania, while maintaining some selectivity versus homologous host enzymes, would be ideal drug leads.

We have identified both reversible and irreversible cysteine protease inhibitors that meet these criteria. Reversible inhibitors were discovered through a structure-based drug design screen and subsequent combinatorial synthetic optimization using models of both Leishmania major cpB and cpL (2). The irreversible inhibitors are pseudopeptide substrate analogues that take advantage of the unique reactivity of the active site sulfhydryl of cysteine proteases to confer specificity for this enzyme family but maintain activity against both cpL and cpB proteases (7, 8).

METHODS

Inhibitors. The reversible inhibitors ZLIII43A and ZLIII115A are derivatives of oxalic bis[(2-hydroxy-1-naphthyl)methylene]hydrazide, a cysteine protease inhibitor lead compound found in a computer graphics screen of the Fine Chemicals Directory (9). Several of the synthetic derivatives of that lead, produced by combinatorial synthetic chemistry, proved to be potent inhibitors of homologous cpLs of malaria (10) and the Leishmania cpB (2). The irreversible inhibitor used was the pseudopeptide substrate analogue morpholine urea-phenylalanine-homophenylalanine-vinylsulfonyl-benzene (K11002, Arris Pharmaceuticals, South San Francisco, CA). Inhibitors were prepared as 20 mM stocks in dimethyl sulfoxide (DMSO) and stored at –20°C.

†To whom reprint requests may be addressed at present address: Hoechst Roussel Vet GmbH, Research Pharmaceuticals, Building H811, D-65926 Frankfurt/Main, Germany. E-mail: [email protected].

††To whom reprint requests may be addressed at: Department of Pathology, University of California, Tropical Disease Research Unit, VAMC, 4150 Clement Street 113B, San Francisco, CA 94121. E-mail: [email protected].

PNAS is available online at www.pnas.org.

Abbreviations: cpL, cathepsin L-like cysteine protease; cpB, cathepsin B-like cysteine protease; AMC, 7-amino-4-methylcoumarin.

Protease Assays. The native L. major cpB was a gift of Jacques Bouvier (Novartis, St. Aubin, Switzerland). Papain (EC 3.4.22.2) and mammalian cathepsin B (bovine spleen; EC 3.4.22.1) were from Sigma. Recombinant cruzain was produced as previously described (11). All proteases were assayed at 25°C with an automated microtiter plate spectrofluorometer (Labsystem FluoroScan II; Northbrook, IL). Activity was detected by the liberation of 7-amino-4-methylcoumarin (AMC) (excitation wavelength=355 nm and emission wavelength=460 nm) from the synthetic peptide substrate Z-Phe-Arg-AMC (Z=benzyloxycarbonyl) (Enzyme Systems Products, Livermore, CA). The enzyme concentrations were determined by active site titration. Reversible inhibitors at various concentrations were preincubated with the respective enzyme for 5 min before the reaction was started by adding the substrate. Enzyme activities were expressed as percent residual activity compared with an uninhibited control, and were plotted versus increasing inhibitor concentrations to calculate the IC50. Assay conditions were as follows. L. major cpB: 100 mM sodium acetate at pH 5.5, 10 mM dithiothreitol (DTT), 1 mM EDTA, 0.1% Triton X-100, 50 µM Z-Phe-Arg-AMC final concentration (from a 10 mM stock solution in DMSO); Km=7 µM. Papain and mammalian cathepsin B: 100 mM sodium acetate at pH 5.5, 10 mM DTT, 100 µM Z-Phe-Arg-AMC final concentration; Km=50 µM and 110 µM, respectively. Cruzain: The assay conditions were the same as for papain except that the substrate concentration was 20 µM; Km=1 µM. Km values were determined by nonlinear regression using the software ULTRAFIT (Biosoft, Ferguson, MO). Irreversible inhibitors were assayed in a time-based inactivation assay. The inactivation process was based on the following scheme:

$$E + I \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} EI \xrightarrow{k_{\mathrm{inact}}} E\text{-}I$$

where E=enzyme, I=inhibitor, EI=noncovalent enzyme-inhibitor complex, E-I=inactivated enzyme, $k_{1}$ and $k_{-1}$=noncovalent rate constants ($K_{i}=k_{-1}/k_{1}$), and $k_{\mathrm{inact}}$=first-order inactivation constant. The values of $K_{i}$ and $k_{\mathrm{inact}}$ were determined from progress curves in the presence of substrate and inhibitor. These curves were fit to a first-order equation (ULTRAFIT) to produce $k_{\mathrm{obs}}$ (observed inactivation constant) values, where $k_{\mathrm{obs}}=k_{\mathrm{inact}}[I]/(K_{i,\mathrm{app}}+[I])$ and $K_{i,\mathrm{app}}$ is the apparent $K_{i}$. Plotting $1/k_{\mathrm{obs}}$ versus $1/[I]$ gives the values for $K_{i,\mathrm{app}}$ and $k_{\mathrm{inact}}$. Taking the substrate into consideration, the true $K_{i}$ was calculated by $K_{i}=K_{i,\mathrm{app}}/(1+[S]/K_{m})$. At least six different inhibitor concentrations were determined in duplicate for a minimum of three independent experiments. The reaction was started by adding the enzyme, and the time-dependent inactivation was monitored. Enzyme (E) and substrate (S) concentrations: L. major cpB, E=1–2 nM, S=2.5 µM; cruzain, E=5 nM, S=5 µM; papain, E=6 nM, S=15 µM; and cpB, E=10 nM, S=10 µM.
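The double-reciprocal analysis described above can be illustrated with a short, self-contained sketch. It is not the ULTRAFIT procedure used in the study: the inhibitor concentrations are hypothetical, the kobs values are generated without noise from the L. major cpB constants reported in Table 2, and the straight-line fit is an ordinary least-squares regression written out by hand. The substrate concentration (2.5 µM) and Km (7 µM) are those given in the text.

```python
def k_obs(inhibitor_um, k_inact, ki_app_um):
    """Observed inactivation constant: kobs = kinact*[I] / (Ki,app + [I])."""
    return k_inact * inhibitor_um / (ki_app_um + inhibitor_um)


def fit_double_reciprocal(inhibitor_um, kobs_values):
    """Least-squares line through (1/[I], 1/kobs).

    1/kobs = (Ki,app/kinact) * (1/[I]) + 1/kinact, so the intercept gives
    1/kinact and slope * kinact gives Ki,app.
    """
    xs = [1.0 / c for c in inhibitor_um]
    ys = [1.0 / k for k in kobs_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_inact = 1.0 / intercept
    return k_inact, slope * k_inact


def true_ki(ki_app_um, substrate_um, km_um):
    """Correct the apparent Ki for substrate: Ki = Ki,app / (1 + [S]/Km)."""
    return ki_app_um / (1.0 + substrate_um / km_um)


# Constants for L. major cpB taken from the text and Table 2.
K_INACT, KI_UM, S_UM, KM_UM = 0.021, 0.205, 2.5, 7.0
KI_APP_UM = KI_UM * (1.0 + S_UM / KM_UM)

# Hypothetical inhibitor concentrations (uM); synthetic, noise-free kobs.
CONCENTRATIONS_UM = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
KOBS = [k_obs(c, K_INACT, KI_APP_UM) for c in CONCENTRATIONS_UM]

K_INACT_FIT, KI_APP_FIT = fit_double_reciprocal(CONCENTRATIONS_UM, KOBS)
KI_FIT = true_ki(KI_APP_FIT, S_UM, KM_UM)

if __name__ == "__main__":
    print(f"kinact = {K_INACT_FIT:.4f} s^-1")
    print(f"Ki,app = {KI_APP_FIT:.4f} uM, Ki = {KI_FIT:.4f} uM")
```

With real progress-curve data, the reciprocal transform weights low-concentration points heavily, so a direct nonlinear fit of the hyperbola may be preferable; the linear form is shown here only because it is the one named in the text.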

Cell Culture Assays. L. major promastigotes LV39 (MRHO/SU/59/P) were grown at 27°C in 5 ml (25-cm² cell culture flask; Costar, Cambridge, MA) of RPMI medium 1640 containing 10% (vol/vol) heat-inactivated fetal bovine serum (FBS) and 20% brain heart infusion tryptose. Parasites were maintained in the exponential growth phase by passing them twice a week. For inhibitor studies, 10⁶ cells per ml were inoculated in new cultures, and cell growth was determined by counting the parasites with a Neubauer hemocytometer (A.O. Instruments, Buffalo, NY). The mouse macrophage cell line J774 was maintained in 75-cm² cell culture flasks (Costar) at 37°C in RPMI medium 1640 containing 5% FBS (12 ml total volume) and passed once a week. Irradiated J774 cells (10 min, 2,700 rad, 24 h before infection) were cultured on glass coverslips in six-well cluster plates (Costar) and infected with stationary-phase promastigotes in a ratio of 1:10 for 12 h. After the infected macrophage monolayers had been washed three times with RPMI 1640, inhibitors were added to the culture and plates were incubated for 5 days at 32°C in a 5% CO2/95% air atmosphere. To determine the number of amastigotes per macrophage, cells were fixed in 100% methanol and stained with Giemsa stain. At least 200 macrophages per experiment were examined to monitor the effect of the inhibitors. Inhibitors dissolved in DMSO were from 20 mM stock solutions. DMSO concentrations up to 0.5% showed no effect on promastigotes, amastigotes, or J774 cells.
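The DMSO carried into a culture follows directly from the stock concentration. The sketch below assumes that the 20 mM stock is in neat DMSO and is diluted straight into the medium with no intermediate aqueous step; the inhibitor concentrations shown are those used in the parasite growth experiments, and the function names are illustrative.

```python
STOCK_MM = 20.0          # inhibitor stock concentration in DMSO (mM)
DMSO_LIMIT_PERCENT = 0.5  # highest DMSO level reported to have no effect


def dilution_factor(final_um: float, stock_mm: float = STOCK_MM) -> float:
    """Fold dilution needed to reach final_um (uM) from the stock."""
    return stock_mm * 1000.0 / final_um


def dmso_percent(final_um: float, stock_mm: float = STOCK_MM) -> float:
    """Final DMSO concentration (% vol/vol), assuming a neat-DMSO stock."""
    return 100.0 / dilution_factor(final_um, stock_mm)


def stock_volume_ul(final_um: float, culture_ml: float,
                    stock_mm: float = STOCK_MM) -> float:
    """Volume of stock (ul) to add to a culture of culture_ml (ml)."""
    return culture_ml * 1000.0 / dilution_factor(final_um, stock_mm)


if __name__ == "__main__":
    for conc in (5.0, 20.0, 50.0):
        print(f"{conc:4.0f} uM: {dilution_factor(conc):6.0f}-fold dilution, "
              f"{dmso_percent(conc):.3f}% DMSO, "
              f"{stock_volume_ul(conc, 5.0):.2f} ul stock per 5 ml culture")
```

Under this assumption even the highest inhibitor concentration tested (50 µM) corresponds to 0.25% DMSO, below the 0.5% level reported to be without effect.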

Electron Microscopy and ImmunoGold Localization. One to 5×10⁸ promastigote parasites, treated or untreated, were washed twice with PBS (4°C, 10 min, 3,000 rpm in a Beckman Accuspin-FR centrifuge). Cells were fixed in 0.1 M sodium cacodylate buffer at pH 7.4 containing 1.5% glutaraldehyde (0.25% for ImmunoGold labeling) and 1% sucrose. Epon embedding, LR white embedding, and thin sectioning were performed according to standard protocols (12–14).

For ImmunoGold labeling, a polyclonal antiserum raised against the native L. major cpB was used in a 1:20 or 1:100 dilution, followed by a secondary antibody conjugated with 10-nm gold particles (goat antibody to rabbit IgG, 1:50, Amersham Life Sciences). Serum from the rabbit before immunization, BSA, and bovine serum were used for specificity controls. Photographs were taken with a Zeiss EM10C.

Alternatively, promastigotes of L. major were surface labeled with 500 µg/ml N-hydroxysuccinimide-biotin in PBS (pH 7.6) for 20 min on ice. The cells were washed and placed in medium at 25°C for 60 min. They were fixed in 200 mM Pipes with 4% paraformaldehyde, frozen, and processed for immunoelectron microscopy as described previously (15, 16). The thawed cryosections were probed with streptavidin (1 µg/ml), followed by mouse monoclonal anti-streptavidin and rabbit antibody to L. major cathepsin B. The antibodies were revealed by 12-nm gold-conjugated goat anti-mouse IgG and 18-nm gold-conjugated goat anti-rabbit IgG (Jackson ImmunoResearch).

Promastigote Extracts and Western Blot Analysis. Five×10⁹ promastigotes were washed twice with PBS at pH 7.4. Cells were sonicated (Sonic Dismembranator 300, Fisher Scientific) on ice (three×10 sec, relative output 0.6) and adjusted with sodium acetate buffer at pH 5.5 to 1×10⁹ cells per ml. Aliquots were stored at –20°C for 4 months without any loss of cysteine protease activity. Samples of 100 µl were solubilized by adding 20 µl of 6-fold concentrated Laemmli buffer. Samples were subjected to SDS/10% PAGE and transferred to nitrocellulose sheets. The immunoblots were incubated in 2.5% (wt/vol) blocking reagent (Boehringer Mannheim) in 100 mM maleic acid buffer (pH 7.5) for 60 min at room temperature, and then incubated overnight at 4°C with a rabbit polyclonal antiserum raised against L. major cpB or L. mexicana cpL that had been diluted 1:1000 or 1:500, respectively, in 100 mM Tris-HCl, pH 7.5, with 0.05% Tween 20 and 1% FCS. After incubation with horseradish peroxidase-conjugated secondary antibodies (1:3000; goat anti-rabbit IgG; Gibco BRL Life Technologies) for 60 min at room temperature, the blots were developed using ECL (Amersham Life Science). For active site labeling of cysteine proteases, promastigote extracts were incubated either with 50 µM ¹⁴C-labeled K11002 for 15 min at room temperature or with ¹²⁵I-labeled p-nitrophenyl-derivatized E-64 and vinyl sulfone as previously described (17). Samples of 100 µl were subjected to SDS/PAGE and analyzed by fluorography.

Table 1. Inhibition of cysteine proteases with reversible inhibitors

                         IC50, µM
Enzyme                   ZLIII115A   ZLIII43A
L. major cpB             10          2
Cruzain                  10          5
Papain                   >50         10
Mammalian cathepsin B    20          20

See Protease Assays for details of assay used.
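The selectivity of the reversible inhibitors can be read off Table 1 as a ratio of IC50 values. The following sketch computes that ratio relative to L. major cpB; the data structure and function name are illustrative, and the papain value for ZLIII115A is entered as 50 µM even though the table reports it only as a lower bound (>50 µM), so the corresponding ratio is itself a lower bound.

```python
# IC50 values (uM) transcribed from Table 1.
IC50_UM = {
    "ZLIII115A": {
        "L. major cpB": 10.0,
        "Cruzain": 10.0,
        "Papain": 50.0,   # reported as >50 uM, so this is a lower bound
        "Mammalian cathepsin B": 20.0,
    },
    "ZLIII43A": {
        "L. major cpB": 2.0,
        "Cruzain": 5.0,
        "Papain": 10.0,
        "Mammalian cathepsin B": 20.0,
    },
}
LOWER_BOUNDS = {("ZLIII115A", "Papain")}


def fold_selectivity(inhibitor: str, enzyme: str,
                     reference: str = "L. major cpB") -> float:
    """IC50(enzyme) / IC50(reference); values above 1 favor the reference."""
    table = IC50_UM[inhibitor]
    return table[enzyme] / table[reference]


if __name__ == "__main__":
    for inhibitor, table in IC50_UM.items():
        for enzyme in table:
            if enzyme == "L. major cpB":
                continue
            prefix = ">" if (inhibitor, enzyme) in LOWER_BOUNDS else ""
            ratio = fold_selectivity(inhibitor, enzyme)
            print(f"{inhibitor:10s} {enzyme:22s} {prefix}{ratio:.1f}-fold")
```

For papain and mammalian cathepsin B the ratios fall between 2 and 10, while cruzain is inhibited about as well as the L. major enzyme.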

Animal Model of Infection. All procedures were approved by the University of California, San Francisco Committee on Animal Research. Two×10⁵ metacyclic L. major (WHOM/IR/173) were obtained by peanut agglutinin selection as previously described (18) and were injected into each hind footpad of female BALB/c mice (18–20 g, Simonsen Laboratories, Gilroy, CA). Strain WHOM/IR/173 was used because of its higher virulence in mice compared with strain LV39 (MRHO/SU/59/P). Compounds were dissolved in 100% DMSO and stored at –20°C. The final concentrations were adjusted with sterile water to give a 70:30 (vol/vol) DMSO/H2O mixture. Twenty-four hours after infection, mice were treated with 100 µl of K11002 or ZLIII115A (100 mg/kg per day, every day, 4 weeks, intraperitoneal) in either a single dose or split into two treatments per day. Each set tested consisted of five mice, including an untreated control, a DMSO-treated control, as well as uninfected mice treated with the appropriate compound. To monitor the course of the infection, the thickness of the footpads was measured once a week by a standard method using a metric caliper (dial thickness gauge no. 7305; Mitutoyo, Kawasaki, Japan) (19). To quantify parasite burden, whole footpad histology as well as limited parasite dilution assays from footpad tissues (20) were performed at the end of the experiments. To investigate side effects of the compounds, mouse liver tissue was embedded in paraffin and 5-µm sections were stained with hematoxylin. Sections of treated tissues were then compared with control liver tissue.
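The dosing regimen implies a fairly concentrated injection solution, which a short calculation makes explicit. The sketch assumes that the full daily dose is delivered in a single 100 µl injection; how the volume was divided for the twice-daily schedule is not stated, so that case is not modeled. Function names are illustrative.

```python
DOSE_MG_PER_KG_PER_DAY = 100.0
INJECTION_VOLUME_UL = 100.0


def dose_per_mouse_mg(body_mass_g: float,
                      dose_mg_per_kg: float = DOSE_MG_PER_KG_PER_DAY) -> float:
    """Daily compound mass (mg) for a mouse of the given body mass (g)."""
    return dose_mg_per_kg * body_mass_g / 1000.0


def injection_concentration_mg_per_ml(
        body_mass_g: float,
        volume_ul: float = INJECTION_VOLUME_UL) -> float:
    """Concentration needed if the daily dose is given in one injection."""
    return dose_per_mouse_mg(body_mass_g) / (volume_ul / 1000.0)


if __name__ == "__main__":
    for mass in (18.0, 20.0):
        print(f"{mass:.0f} g mouse: {dose_per_mouse_mg(mass):.1f} mg/day, "
              f"{injection_concentration_mg_per_ml(mass):.0f} mg/ml "
              f"in {INJECTION_VOLUME_UL:.0f} ul")
```

For the 18–20 g mice used, this corresponds to about 1.8–2.0 mg of compound per day, or 18–20 mg/ml in the injected volume.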

Table 2. Inhibition of cysteine proteases with K11002

Enzyme                   kinact, s^-1       Ki, µM           kinact/Ki, s^-1·M^-1
L. major cpB             0.021 ± 0.0014     0.205 ± 0.077    107,000 ± 32,000
Cruzain                  0.064 ± 0.027      0.17 ± 0.074     383,000 ± 27,000
Papain                   0.072 ± 0.009      0.261 ± 0.025    275,000 ± 10,000
Mammalian cathepsin B    0.014 ± 0.001      9.8 ± 2.3        1,400 ± 250

Note the similar kinact values versus the differences in the Ki values that are mainly responsible for the divergence of the second-order rate constants. See Protease Assays for details of assay used.
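The relationship among the columns of Table 2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the mean values transcribed from the table, ignores the ± errors, and the function and variable names are assumptions introduced here, not part of the original analysis. Ratios computed from the means differ slightly from the tabulated kinact/Ki values, presumably because the published figures were derived from the individual experiments.

```python
# Mean values transcribed from Table 2: (kinact in s^-1, Ki in µM).
# The ± errors are omitted; names below are illustrative assumptions.
TABLE2 = {
    "L. major cpB": (0.021, 0.205),
    "Cruzain": (0.064, 0.17),
    "Papain": (0.072, 0.261),
    "Mammalian cathepsin B": (0.014, 9.8),
}


def second_order_rate(kinact_per_s, ki_micromolar):
    """Return kinact/Ki in s^-1 M^-1, converting Ki from µM to M."""
    return kinact_per_s / (ki_micromolar * 1e-6)


rates = {name: second_order_rate(*values) for name, values in TABLE2.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:,.0f} s^-1 M^-1")

# Selectivity of K11002 for the parasite enzyme over host cathepsin B.
selectivity = rates["L. major cpB"] / rates["Mammalian cathepsin B"]

# Spread of kinact and of Ki across the four enzymes.
kinact_spread = max(v[0] for v in TABLE2.values()) / min(v[0] for v in TABLE2.values())
ki_spread = max(v[1] for v in TABLE2.values()) / min(v[1] for v in TABLE2.values())
print(f"selectivity (cpB vs cathepsin B): {selectivity:.0f}-fold")
print(f"kinact spread: {kinact_spread:.1f}-fold, Ki spread: {ki_spread:.0f}-fold")
```

The kinact values span only about 5-fold, whereas Ki spans more than 50-fold, which is the pattern the table note attributes the divergence of the second-order rate constants to.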

Cytokine Assays. IL-4 and IFN-γ were assayed in inhibitor-treated and untreated mice by monoclonal-based ELISA and normalized to standard controls as previously described (20).

RESULTS

The reversible hydrazide inhibitors, ZLIII115A and ZLIII43A, were tested against the L. major cpB, cruzain (the major cpL of Trypanosoma cruzi), papain, and mammalian cathepsin B. There was 2- to 10-fold higher inhibitory activity toward the L. major cpB versus the plant or mammalian proteases (Table 1). In the case of the irreversible pseudopeptide inhibitor K11002 (7), the first-order inactivation constant (kinact) was similar for the four enzymes. However, differences of up to 100-fold were observed for the second-order rate constants (kinact/Ki) (Table 2). The differences in activity of the mammalian cathepsin B versus the plant or parasite proteases are mainly due to differences in Ki. The L. major cpB, cruzain, and papain were inhibited to a similar extent. This similarity is consistent with the paradoxical cathepsin L-like substrate preference of the L. major cpB, due to a single amino acid modification in the S2 binding pocket (2).

The three inhibitors were next tested in cell cultures of L. major promastigotes, the extracellular stage of the parasite. Inhibitors were added to replicating Leishmania as a single dose, and cell growth was monitored for 3 days. Both the irreversible and the reversible inhibitors blocked replication of the parasite. Concentrations of 5 µM inhibited parasite growth 10-fold, whereas 20 µM and 50 µM completely inhibited cell growth (Fig. 1). Exchanging the medium every day for a total of 3 days, thereby keeping inhibitor concentrations stable (20 µM and 50 µM), led to death of the parasites. After the fourth day of this latter experiment, the medium was replaced with fresh medium without inhibitor, and the flasks were again kept under culture conditions. Even after 10 days no parasites could be detected, indicating a complete cure of the Leishmania culture by the cysteine protease inhibitors.

FIG. 1. Effects of K11002 (A) and ZLIII115A (B) on the growth of L. major promastigotes. The compounds were added at time point 0, and cell growth was monitored for 3 days. ZLIII43A showed very similar inhibition profiles (2). Symbols denote the control and 5 µM, 20 µM, and 50 µM inhibitor. Points are means of three independent experiments.

To confirm that the inhibitors could access the intracellular cysteine proteases of L. major, promastigote parasites were harvested after treatment with 50 µM K11002 for 24 h and extracted. The residual cysteine protease activity as measured with the fluorogenic substrate Z-Phe-Arg-AMC (which detects both L. major cpL and cpB activity) in K11002-treated cells was 20% ± 7% (n = 5) relative to the control parasites. Targeting of cathepsins by K11002 was also confirmed by “tagging” the target proteases with radioactively labeled inhibitor and identification of protein species by parallel Western blot (Fig. 2). The predominant cysteine protease in L. major promastigotes is cpB (21), and this species was the predominant, but not exclusive, target of the vinyl sulfone inhibitor. The mature catalytic domains of L. major cathepsins B and L

were labeled, as were intermediates in protease processing (Fig. 2).

FIG. 2. Inhibitor targets and Western blot analysis of promastigote lysates. Proteins from whole promastigote extracts (lanes 1 and 2), or from whole promastigote extracts incubated with labeled inhibitor (lanes 3 and 4), were separated by SDS/PAGE and blotted to nitrocellulose membrane. Lane 1 was probed with a rabbit antiserum raised against a cpL from L. mexicana (a gift of Jeremy Mottram, University of Glasgow); lane 2 was probed with a rabbit antiserum raised against cpB from L. major. Extracts in lane 3 were incubated with 125I-labeled E-64, an epoxide cysteine protease inhibitor. Extracts in lane 4 were labeled with 125I-vinyl sulfone as described previously (17). Note the predominant labeling of the mature (catalytic domain) cpB in lanes 3 and 4 by both inhibitors. The vinyl sulfone also binds to the less abundant mature cpL. Both inhibitors label higher molecular weight protease precursors, which can be tentatively identified as active intermediates in protease processing (Int.) by Western blotting (lanes 1 and 2) and reexpression of specific protease genes in protease-null organisms (Sanya Sanderson and Jeremy Mottram, personal communication). The “complex” band is presumably an aggregate of protease with itself or with a carrier protein.

To determine effects of the inhibitors on Leishmania amastigotes, the stage of the parasite that resides within mammalian host cells, irradiated macrophages (J774 cells) were infected with promastigote stationary-phase Leishmania. After 12 h, 50% of the macrophages were infected with 1 to 4 parasites per host cell. Cells were then treated with a single dose (40 µM) of inhibitor and cultured for another 5 days. At day 5, 85% of untreated J774 cells carried more than 9 parasites. This observation confirms that parasites replicate within the host cells and infect new macrophages. Replication of parasites was decreased in cultures treated with the pseudopeptide inhibitor or the hydrazide inhibitors, and few if any new macrophages were infected after treatment (Table 3). Host cell morphology was not affected by the treatment, and nonirradiated, inhibitor-treated macrophages had no difference in growth rate compared with untreated cells.

Table 3. Treatment of amastigote parasites

Compound (40 µM)    % of host cells infected    % of host cells with given number of amastigotes
                                                1–4         5–9        >10
Control 12 h        48 ± 9                      48 ± 9      —          —
Control 5 d         85 ± 7                      25 ± 5      22 ± 5     38 ± 7
ZLIII115A 5 d       58 ± 9                      58 ± 9      —          —
ZLIII43A 5 d        56 ± 12                     56 ± 12     —          —
K11002 5 d          65 ± 7                      58 ± 2      7 ± 9      —

Irradiated J774 host cell macrophages were infected with L. major promastigotes for 12 h to allow infection of the cells and development of amastigote parasites. After 12 h, 50% of the macrophages were infected. The established infection was then treated with a single dose of the hydrazide (ZL-) or the vinyl sulfone (K11002) protease inhibitors. After 5 days at 32°C, untreated (control) cells were highly infected, whereas treated cells remained essentially unchanged with respect to the initial (already established) infection. Numbers are expressed in percent and are means ± SD of three independent experiments.

The hydrazide inhibitors and the pseudopeptide inhibitor produced very similar effects on the organelle structure within the parasites. After 24 h of treatment, myelin figures, undigested cell debris, dense bodies, and multivesicular bodies
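The structure of Table 3 can be made explicit with a short tabulation. This is a minimal sketch under stated assumptions: only the mean percentages are used, a dash in the table is treated as 0%, and the dictionary layout and function name are illustrative choices made here rather than anything taken from the original analysis.

```python
# Mean percentages transcribed from Table 3:
# name -> (% of host cells infected, [% with 1-4, % with 5-9, % with >10]).
# Assumption: a dash in the table is read as 0%.
TABLE3 = {
    "Control 12 h": (48, [48, 0, 0]),
    "Control 5 d": (85, [25, 22, 38]),
    "ZLIII115A 5 d": (58, [58, 0, 0]),
    "ZLIII43A 5 d": (56, [56, 0, 0]),
    "K11002 5 d": (65, [58, 7, 0]),
}


def heavily_infected_share(total_infected, bins):
    """Share of infected host cells that carry five or more amastigotes."""
    return (bins[1] + bins[2]) / total_infected


for name, (total, bins) in TABLE3.items():
    # The three parasite-load bins should add up to the infected total.
    consistent = sum(bins) == total
    share = heavily_infected_share(total, bins)
    print(f"{name}: bins sum to total = {consistent}, "
          f"share with >=5 amastigotes = {share:.0%}")
```

Read this way, the parasite-load columns partition the infected cells in every row, and the shift toward heavily loaded host cells appears only in the untreated 5-day control.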

appeared within the abnormally dilated parasite lysosomes and the flagellar pocket (Fig. 3), the site where endocytosis and exocytosis take place (22). These abnormalities resemble alterations seen in lysosomal storage diseases caused by the deficiency or absence of specific lysosomal hydrolases (23). The nucleus and the Golgi apparatus were not affected, but some cells showed dilated mitochondria. In these latter cells the kinetoplast DNA was no longer condensed but appeared in diffuse patches. No effects on treated mammalian host cells at the light microscopic or ultrastructural level were observed.

FIG. 3. Electron micrographs of Epon-embedded L. major promastigotes. Parasites were untreated (A) or treated for 24 h with 50 µM K11002 (B and C) or 50 µM ZLIII115A (D). Treatment of the parasite with either inhibitor had very similar effects, resulting in the appearance of diverse multivesicular and dense bodies (arrowheads), lipid inclusions (arrows), and myelin figures (asterisk). n, Nucleus; g, Golgi apparatus; f, flagellar pocket; m, mitochondrion; k, kinetoplast. (Bars=10 µm.)

To localize the Leishmania cysteine proteases within the parasite cell, ImmunoGold electron microscopic analysis using a L. major cpB-specific antiserum was performed. In untreated cells the gold label appeared only in lysosomes (Fig. 4 C and D). Treated cells were more heavily labeled in the dilated lysosome/endosome compartment and in the flagellar pocket (Fig. 4 E and F). Apparently empty flagellar pockets were also heavily labeled in treated parasites, but not in untreated parasites. To confirm target protease localization at the site of inhibitor-induced abnormalities, untreated promastigotes were surface-labeled with N-hydroxysuccinimide-biotin and placed back in culture to facilitate internalization of labeled proteins. This method allows visualization of the endosomal/lysosomal network of the cells. ImmunoGold electron microscopy of these parasites revealed an abundance of cathepsin B in the flagellar pocket and in vesicles subtending that structure (Fig. 4 A and B). Some of these vesicles contained biotinylated proteins, indicating that they are endosomes or lysosomes, whereas others contained only cathepsin B, suggesting that they may be secretory vesicles.

FIG. 4. Immunoelectron micrographs of L. major promastigotes. (A and B) Electron micrographs of cryosections from L. major promastigotes that were surface biotinylated with N-hydroxysuccinimide-biotin and incubated in medium for 45 min prior to fixation. The sections were probed with streptavidin/mouse anti-streptavidin mAb (12-nm gold particle conjugated to goat anti-mouse IgG) and rabbit anti-cathepsin B antibody (18-nm gold particle conjugated to goat anti-rabbit IgG). g, Golgi apparatus; e, endosome; n, nucleus; k, kinetoplast; m, mitochondrion. (Bars=0.25 µm.) (A) Cathepsin B label is observed in vesicles in the vicinity of the Golgi apparatus and in the flagellar pocket, which is strongly positive for biotinylated proteins. (B) Label is also observed in biotin-positive vesicles, or endosomes that subtend the flagellar pocket. The data indicated that cathepsin B is synthesized and proceeds through the Golgi apparatus into the secretory network, where it gains access to the flagellar pocket. (C–F) LR white-embedded and ImmunoGold-labeled (anti-L. major cpB antiserum) promastigotes. Note the specific labeling in lysosomes of untreated cells (C and D) and in multivesicular bodies (arrowheads), as well as in the flagellar pocket of treated cells (E and F). (Bars=0.5 µm.)

Because of the selective arrest of parasite versus host cell growth by inhibitors added to cultures, the efficacy of the cysteine protease inhibitors in vivo was evaluated in Leishmania-infected BALB/c mice. Twenty-four hours after infection, mice received intraperitoneal injections of K11002 or ZLIII115A dissolved in DMSO/H2O (70:30). By 2 weeks, control mice had already developed footpad lesions, which progressed in size and severity. In treated animals the lesion development was significantly delayed, with no swelling of the footpads until 3–4 weeks (Figs. 5 and 6). After 4 weeks of treatment, inhibitor dosing was stopped, and lesion development paralleled that seen in control mice. At the end of the treatment period, whole footpad histology and limiting dilution assays of parasites from extracted footpad tissues showed parasite burden for treated animals was at least two logs lower than that of the untreated animals (10–7 versus 10–5). This finding is consistent with results from previous studies that documented the correlation between footpad size and numbers of parasites (18, 20). None of the compounds produced toxic effects in mice, as indicated by daily observation of weight, activity, and appearance, as well as autopsy and histologic analysis. Also, there was no evidence of a switch from the usual TH2 cytokine response to Leishmania in

inhibitor-treated mice, as was reported by Maekawa et al. (24), who used the cathepsin B-specific inhibitor CA074. IFN-γ levels remained unchanged (36.7 ± 15.8 ng/ml for 10^6 cells, treated mice, versus 33.4 ± 12.3 ng/ml for 10^6 cells, untreated mice), and IL-4 levels remained elevated (14.4 ± 1.0 ng/ml for 10^6 cells, treated mice, versus 16.7 ± 2.1 ng/ml for 10^6 cells, untreated mice).

FIG. 5. Lesion development in Leishmania-infected BALB/c mice. The picture was taken after 3 weeks of treatment with K11002 (100 mg/kg per day). Note the erythema and gross edema of the footpads in the untreated mouse (left), versus no edema or erythema of the footpads in the protease inhibitor-treated mouse (right).

FIG. 6. Lesion sizes of treated and untreated infected BALB/c mice. Bars are means ± SD of five mice (two footpads per mouse). Data shown are from one representative experiment of three independent experiments.

DISCUSSION

The results of the studies presented here suggest that cysteine protease inhibitors can have selective therapeutic effects in diseases such as leishmaniasis, where exogenous (microbial) protease activity is targeted. By inference, the lack of any significant organ or systemic toxicity of cysteine protease inhibitors also suggests that they may have utility in diseases where endogenous proteases are present in abnormal cellular or extracellular locations or at abnormally elevated levels.

Studies of cathepsin L family and cathepsin B family gene knockouts in L. mexicana (3, 4, 6) have suggested that at least two of the three cysteine protease gene families (cpa, cpb, cpc) would need to be eliminated to completely prevent parasite invasion or replication in host cells and lesion development in vivo. The inhibition of both amastigote infection of macrophages and lesion development in mice by cysteine protease inhibitors reported here is comparable to, and consistent with, the results of double cysteine protease gene knockout studies in L. mexicana (3). However, the effects seen with cysteine protease inhibitors on promastigote replication, and the flagellar pocket-endosomal pathway abnormalities seen on ultrastructural analysis, were not observed in the L. mexicana double gene knockout studies. The presence of undigested debris, including myelin figures, in lysosomes or endosomes has been reported with storage diseases caused by absence of lysosomal hydrolases (23). One possibility is that, while each of the three gene families contributes to virulence of Leishmania (amastigote infection of macrophages and lesion development) in a gene dose-dependent manner, all three must be eliminated to affect promastigote replication and lysosome/endosomal function. Alternatively, the inhibitors may have prevented protease precursor processing (either autoproteolytic or by another of the three proteases), resulting in “retrograde” accumulation of unprocessed protease and organelle damage along a lysosome/endosome trafficking pathway (cf. Fig. 4). This condition would be analogous to the Golgi abnormality

observed in inhibitor-treated T. cruzi (25). In the case of L. major, the site of protease precursor processing must be later in the trafficking pathway than that of T. cruzi, probably between the flagellar pocket and the lysosome/endosome compartment. The localization of the protease in secretory vesicles destined for the flagellar pocket and in the pocket itself suggests this is one pathway by which protease may reach the lysosome. It is unclear whether the localization of cathepsin B to the flagellar pocket and subtending endosomal compartments represents the major route of delivery to the lysosomes, or whether the proteinase is also delivered directly to more mature lysosomal compartments. It is important to note that the ultrastructural abnormalities in flagellar pocket and lysosome/endosome were seen exclusively and consistently with cysteine protease inhibitors regardless of their specific chemistry (e.g., vinyl sulfone versus dihydrazide). This observation suggests that the cellular alterations seen are due specifically to inhibition of the cysteine proteases of Leishmania.

By labeled inhibitor studies, the predominant target of the cysteine protease inhibitors is Leishmania cpB (Fig. 2). However, this conclusion probably reflects the fact that cpB is the most abundant species in L. major promastigotes. In fact, the inhibitors used effectively arrest the activity of both Leishmania cpB and cpL proteases when assayed against protease activity in either the aqueous phase or detergent phase of a Triton X-114 phase separation of promastigote extracts (P.M.S., unpublished data). Furthermore, the L. major cpB, while having sequence and structure homology to other members of the cpB family, has a substrate preference similar to that of cpL because of the absence of a glutamic acid side chain at the base of the S2 binding pocket (2). Eighty percent of the total Leishmania cysteine protease activity measured by the substrate Z-Phe-Arg-AMC could be inhibited by treatment of promastigotes with 50 µM K11002 for 24 h. This was sufficient to halt parasite replication.

A final issue concerning the Leishmania cpB and the effects of cysteine protease inhibitors in vivo arises from the results of a study by Maekawa et al. (24), who analyzed the effects of the cathepsin B-specific inhibitor CA074 on Leishmania infection in mice. Administration of this inhibitor to highly susceptible BALB/c mice resulted in a switch from the usual ineffectual TH2 cytokine response to a TH1 response that cleared the Leishmania infection. These authors concluded that inhibition of mammalian cathepsin B by CA074 resulted in altered expression of Leishmania antigens on MHC class II cells, producing the cytokine shift. This does not appear to be the mechanism contributing to the clearance of parasites in our study. As reported by Maekawa et al. (24), and confirmed by our own assays, CA074 alone does not inhibit Leishmania replication in vitro even at concentrations above 20 µM (our results) and 100 µM (24). Because CA074 does inhibit the Leishmania cpB in direct protease assays, these two results suggest that inhibition of a single type of cysteine protease is insufficient to block parasite replication. This is consistent with the results of the null mutant studies on L. mexicana cysteine protease gene families (3, 4). On the other hand, administration of the vinyl sulfone inhibitor to mice in our study did not result in a switch from TH2 to TH1 cytokines, as documented by direct measurements of IL-4 and IFN-γ levels. The vinyl sulfone inhibitor is a less effective inhibitor of mammalian cathepsin B (Table 2), whereas CA074 is a very specific and effective inhibitor of both the Leishmania and mammalian cathepsin B. We therefore conclude that the vinyl sulfone inhibitor exerts its effect by inhibiting parasite replication, as was observed in in vitro assays (Fig. 1), by virtue of its ability to inhibit both cpB and cpL Leishmania proteases.

The lack of observed toxicity to either mammalian cells in culture or mice, at the concentrations or doses of cysteine protease inhibitors used in this study, is reassuring but in some ways surprising. Tables 1 and 2 indicate that selectivity of the inhibitors versus mammalian cathepsin B, for example, is significant for the vinyl sulfone compound, but relatively less for the dihydrazides. Nevertheless, neither compound produced a significant alteration in host cells at concentrations up to 50 µM, in terms of either cell replication or ultrastructural appearance. The lack of toxicity at the doses used in mice is consistent with results of a similar study with vinyl sulfone inhibitors in the treatment of T. cruzi infection (26). We cannot rule out the possibility that inhibition of host cathepsin S by the vinyl sulfone inhibitor might affect some aspect of antigen presentation. However, a range-finding toxicology study of K11002 carried out at SRI International (Menlo Park, CA; Study M001–98, Project 1382–405, sponsored by the Developmental Therapeutics Branch of the National Institute of Allergy and Infectious Diseases) found no abnormalities in standard clinical chemistry tests and confirmed that toxicity (dyspnea) in rats treated with this vinyl sulfone inhibitor was not seen until plasma concentrations of inhibitor exceeded 60 µM in males and 120 µM in females. The therapeutic plasma levels of inhibitor in the mouse study reported here (Figs. 5 and 6) range between 5 and 19 µM (W. Jacobsen and L. Benet, personal communication).

The selectivity of the inhibitor effects on the parasite suggests that cysteine proteases are crucial to the parasite, whereas host cells are less sensitive to cysteine protease inhibitors at the concentrations used. The lack of significant toxicity of cysteine protease inhibitors at the concentrations used in cell culture or achieved in mice may derive from several factors. First, parasites appear to take up and concentrate inhibitor much more effectively than do host cell organelles (27). Host cells also have a redundancy of protease activity not present in parasites. Even if one or more host cysteine proteases were inhibited, there may be little phenotypic effect. Finally, the concentration of proteases within host cells is substantially higher (millimolar) than that in parasites (28). Cultures of L. major parasites can be cured with inhibitors that target cysteine proteases, and, for the first time, in vivo studies suggest that disease progression can be reduced without toxicity to the host.
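The plasma concentrations quoted in this paragraph imply a rough margin between therapeutic exposure and the onset of toxicity. The following sketch only restates that arithmetic; the variable and function names are assumptions made here, and the comparison sets toxicity thresholds measured in rats against therapeutic levels measured in mice, so it should be read as an order-of-magnitude illustration rather than a formal therapeutic index.

```python
# Concentrations quoted in the text, in µM.
therapeutic_um = (5.0, 19.0)  # range of plasma levels in the mouse study
toxicity_onset_um = {"male rats": 60.0, "female rats": 120.0}


def safety_margin(toxic_um, therapeutic_peak_um):
    """Fold difference between toxicity onset and the highest therapeutic level."""
    return toxic_um / therapeutic_peak_um


# Use the upper end of the therapeutic range for a conservative estimate.
margins = {
    group: safety_margin(toxic, therapeutic_um[1])
    for group, toxic in toxicity_onset_um.items()
}
for group, margin in margins.items():
    print(f"{group}: {margin:.1f}-fold above the highest therapeutic level")
```

Even against the top of the therapeutic range, the toxicity threshold is roughly 3-fold higher in male rats and roughly 6-fold higher in female rats.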

We thank Christopher Franklin and Elizabeth Hansell for excellent technical assistance, David Rasnick for advice on kinetic analysis, and Dan Friend for discussion of the ultrastructural studies. Jim Palmer (Arris Pharmaceuticals) kindly provided the vinyl sulfone inhibitors. This work was supported by grants from the United Nations Development Programme/World Bank/World Health Organization Special Programme for Research and Training in Tropical Diseases (T21/181/29) to J.A.S., by the National Institutes of Health (AI35707) to J.H.M., and by the National Institutes of Health (AI37977) to D.G.R. J.H.M. is supported by a Burroughs Wellcome Molecular Parasitology Scholar Award. P.M.S. was supported by a fellowship of the Deutsche Forschungsgemeinschaft (Se762/1–1). M.B. is a University of California San Francisco Fellow.

1. World Health Organization (1993) UNDP/World Bank/WHO 8, Leishmaniasis, Special Programme for Research and Training in Tropical Disease. Tropical Disease Research: Progress 1991–1992. Eleventh Programme Report, pp. 77–87.
2. Selzer, P.M., Chen, X., Chan, V.J., Cheng, M., Kenyon, G.L., Kuntz, I.D., Sakanari, J.A., Cohen, F.E. & McKerrow, J.H. (1997) Exp. Parasitol. 87, 212–221.
3. Mottram, J.C., Brooks, D.R. & Coombs, G.H. (1998) Curr. Opin. Microbiol. 1, 455–460.
4. Mottram, J.C., Souza, A.E., Hutchison, J.E., Carter, R., Frame, M.J. & Coombs, G.H. (1996) Proc. Natl. Acad. Sci. USA 93, 6008–6013.
5. Coombs, G.H. & Baxter, J. (1984) Ann. Trop. Med. Parasitol. 78, 21–24.
6. Bart, G., Frame, M.J., Carter, R., Coombs, G.H. & Mottram, J.C. (1997) Mol. Biochem. Parasitol. 88, 53–61.
7. Palmer, J.T., Rasnick, D., Klaus, J.L. & Bromme, D. (1995) J. Med. Chem. 38, 3193–3196.

8. Bromme, D., Klaus, J.L., Okamoto, K., Rasnick, D. & Palmer, J.T. (1996) Biochem. J. 315, 85–89.
9. Ring, C.S., Sun, E., McKerrow, J.H., Lee, G.K., Rosenthal, P.J., Kuntz, I.D. & Cohen, F.E. (1993) Proc. Natl. Acad. Sci. USA 90, 3583–3587.
10. Li, R., Chen, X., Gong, B., Selzer, P.M., Li, Z., Davidson, E., Kurzban, G., Miller, R.E., Nuzum, E.O., McKerrow, J.H., et al. (1996) Bioorg. Med. Chem. 4, 1421–1427.
11. Eakin, A.E., Harth, G., McKerrow, J.H. & Craik, C.S. (1992) J. Biol. Chem. 267, 7411–7420.
12. Tokuyasu, K.T. (1986) J. Microsc. 143, 139–149.
13. Selzer, P.M., Webster, P. & Duszenko, M. (1991) Eur. J. Cell Biol. 56, 104–112.
14. Bannister, L.H. & Kent, A.P. (1993) Methods Mol. Biol. 21, 415–429.
15. Russell, D.G., Xu, S. & Chakraborty, P. (1992) J. Cell Sci. 103, 1193–1210.
16. Russell, D.G. (1994) Methods Cell Biol. 45, 277–288.
17. Bogyo, M., Shin, S., McMaster, J.S. & Ploegh, H.L. (1998) Chem. Biol. 5, 307–320.
18. Sacks, D.L., Hieny, S. & Sher, A. (1985) J. Immunol. 135, 564–569.
19. Heinzel, F.P., Sadick, M.D. & Locksley, R.M. (1988) Exp. Parasitol. 65, 258–268.
20. Fowell, D.J., Magram, J., Turck, C.W., Killeen, N. & Locksley, R.M. (1997) Immunity 6, 559–569.
21. Sakanari, J.A., Nadler, S.A., Chan, V.J., Engel, J.C., Leptak, C. & Bouvier, J. (1997) Exp. Parasitol. 85, 63–76.
22. Overath, P., Stierhof, Y.D. & Wiese, M. (1997) Trends Cell Biol. 7, 27–33.
23. Ghadially, F.N. (1988) in Ultrastructural Pathology of the Cell and Matrix, ed. Ghadially, F.N. (Butterworths, London), pp. 589–765.
24. Maekawa, Y., Himeno, K., Ishikawa, H., Hisaeda, H., Sakai, T., Dainichi, T., Asao, T., Good, R.A. & Katunuma, N. (1998) J. Immunol. 161, 2120–2127.
25. Engel, J.C., Doyle, P.S., Palmer, J., Hsieh, I., Bainton, D.F. & McKerrow, J.H. (1998) J. Cell Sci. 111, 597–606.
26. Engel, J.C., Doyle, P.S., Hsieh, I. & McKerrow, J.H. (1998) J. Exp. Med. 188, 725–734.
27. McGrath, M.E., Eakin, A.E., Engel, J.C., McKerrow, J.H., Craik, C.S. & Fletterick, R.J. (1995) J. Mol. Biol. 247, 251–259.
28. Xing, R., Addington, A.K. & Mason, R.W. (1998) Biochem. J. 332, 499–505.

How the protease thrombin talks to cells

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

SHAUN R. COUGHLIN*Cardiovascular Research Institute and Departments of Medicine and Cellular and Molecular Pharmacology, University of

California, San Francisco, CA 94143–0130ABSTRACT How does a protease act like a hormone to regulate cellular functions? The coagulation protease thrombin

(EC 3.4.21.5) activates platelets and regulates the behavior of other cells by means of G protein-coupled protease-activatedreceptors (PARs). PAR1 is activated when thrombin binds to and cleaves its amino-terminal exodomain to unmask a newreceptor amino terminus. This new amino terminus then serves as a tethered peptide ligand, binding intramolecularly to thebody of the receptor to effect transmembrane signaling. The irreversibility of PAR1’s proteolytic activation mechanism standsin contrast to the reversible ligand binding that activates classical G protein-coupled receptors and compels special mechanismsfor desensitization and resensitization. In endothelial cells and fibroblasts, activated PAR1 rapidly internalizes and then sorts tolysosomes rather than recycling to the plasma membrane as do classical G protein-coupled receptors. This trafficking behavioris critical for termination of thrombin signaling. An intracellular pool of thrombin receptors refreshes the cell surface withnaïve receptors, thereby maintaining thrombin responsiveness. Thus cells have evolved a trafficking solution to the signalingproblem presented by PARs. Four PARs have now been identified. PAR1, PAR3, and PAR4 canallbe activated by thrombin.PAR2 is activated by trypsin and by trypsin-like proteases but not by thrombin. Recent studies with knockout mice, receptor-activating peptides, and blocking antibodies are beginning to define the role of these receptors in vivo.

Among their myriad roles, extracellular proteases can function like hormones to regulate cellular behaviors. Perhaps the best-studied example of such a process is activation of platelets by the coagulation protease thrombin (EC 3.4.21.5). This article briefly reviews our current understanding of the receptors that mediate protease signaling in platelets and other cells and points out some of the interesting questions they raise.

How Does a Protease Talk to a Cell?

Because platelets and thrombin are important in myocardial infarction and other thrombotic processes, understanding how thrombin activates platelets has long been an important goal (1). How does thrombin talk to platelets? Thrombin signaling is mediated at least in part by a family of G protein-coupled protease-activated receptors (PARs), for which PAR1 is the prototype (2, 3). Thrombin activates PAR1 by binding to and cleaving its amino-terminal exodomain to unmask a new receptor amino terminus (2). This new amino terminus then serves as a tethered peptide ligand, binding intramolecularly to the body of the receptor to effect transmembrane signaling (Fig. 1) (2, 4, 5). The synthetic peptide SFLLRN, which mimics the first six amino acids of the new amino terminus unmasked by receptor cleavage, functions as an agonist for PAR1 and activates the receptor independently of thrombin and proteolysis (2, 6, 7). Beyond supporting the tethered ligand model of receptor activation, such peptides have been useful as agonists for probing PAR function in various cell types and as a starting point for antagonist development.

FIG. 1. Mechanism of PAR1 activation. Thrombin (large sphere) recognizes the amino-terminal exodomain of the G protein-coupled thrombin receptor PAR1. This interaction utilizes sites both amino-terminal (P1–P4, small sphere) and carboxyl-terminal (P9′–P14′, small oval) to the thrombin cleavage site. Thrombin cleaves the peptide bond between receptor residues Arg-41 and Ser-42. This serves to unmask a new amino terminus beginning with the sequence SFLLRN (diamond) that functions as a tethered ligand, docking intramolecularly with the body of the receptor to effect transmembrane signaling. hPAR1, human PAR1; the asterisk indicates the activated form. Synthetic SFLLRN peptide will function as an agonist, bypassing the requirement for receptor cleavage.

Thus PAR1 is a peptide receptor that carries its own ligand. The ligand remains hidden until it is revealed by selective cleavage of PAR1’s amino-terminal exodomain. This proteolytic switch removes amino-terminal sequence that sterically hinders ligand function and generates a new protonated amino group at the amino terminus created by receptor cleavage. In the SFLLRN peptide, the cognate protonated amino group is critical for agonist activity (7, 8). Parallels with zymogen activation in serine proteases are apparent (2, 9). In conversion of trypsinogen to trypsin, precise proteolytic cleavage generates a new amino terminus that bears a new protonated amino group, which then docks intramolecularly to trap the protease in its active conformation (9).

Irreversible Activation, Disposable Receptors, and Intracellular Reserves

The mechanism of PAR1 activation is strikingly irreversible. Cleavage of PAR1 by thrombin is irrevocable, and the tethered ligand generated cannot diffuse away from the receptor. In the absence of the reversible ligation that characterizes most receptor systems, how is PAR1 shut off? The β2-adrenergic receptor has served as a prototype for dissecting the molecular events responsible for G protein-coupled receptor desensitization and resensitization (10–13). Upon activation, the β2-adrenergic receptor is rapidly phosphorylated. It then binds arrestin, preventing further interaction with G proteins. Arrestin also mediates internalization of β2-adrenergic receptors via clathrin-coated pits (14, 15). Within an endosomal compartment, receptors dissociate from ligand, are dephosphorylated, and recycle back to the cell surface competent to signal again. Thus trafficking serves to remove activated β2-adrenergic receptors from the cell surface and to return the receptors to the surface in an off state, ready to respond again to ligand.

*To whom reprint requests should be addressed. E-mail: [email protected]. PNAS is available online at www.pnas.org. Abbreviation: PAR, protease-activated receptor.

Like the β2-adrenergic receptor, PAR1 is rapidly phosphorylated and uncoupled from signaling after activation (16, 17). PAR1 is also internalized after activation (18–20). However, instead of efficiently recycling after internalization, activated PAR1 sorts predominantly to lysosomes (18, 19, 21). Indeed, in transfected fibroblast cell lines, activation decreased the half-life of PAR1 from 8 hr to 30 min (22). Recent studies that employed chimeras between PAR1 and the substance P receptor were informative regarding the role of PAR1’s distinct sorting pattern in signal termination (22, 23). Wild-type substance P receptor internalized and recycled after activation like β2-adrenergic receptor; PAR1 bearing the substance P receptor’s cytoplasmic tail (P/S) behaved similarly. By contrast, wild-type PAR1 and a substance P receptor bearing PAR1’s cytoplasmic carboxyl tail (S/P) sorted to lysosomes after activation. Consistent with these observations, PAR1 and the S/P chimera were effectively down-regulated by their respective agonists as assessed by both receptor protein levels and signaling. By contrast, substance P receptor and the P/S chimera showed little down-regulation. Strikingly, cells expressing the P/S chimera signaled indefinitely after exposure to thrombin, apparently due to “resignaling” by cleaved and activated thrombin receptors returning to the cell surface (23). These data suggest that the cytoplasmic tails of PAR1 and substance P receptor specify distinct intracellular sorting patterns in a single cell type. More importantly, the “irreversible” thrombin signaling seen in cells expressing the P/S chimera suggests that lysosomal sorting is indeed necessary to prevent persistent signaling by activated PAR1.
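To put the reported half-lives in perspective, the following minimal sketch assumes that loss of PAR1 follows simple first-order (single-exponential) kinetics. That kinetic form is an assumption made here for illustration and is not stated in the studies cited; the function names and the 1-hr time point are likewise hypothetical. Only the 8-hr and 30-min half-lives come from the text.

```python
import math

# Half-lives reported in the text for PAR1 in transfected fibroblasts (22).
RESTING_HALF_LIFE_HR = 8.0     # unactivated receptor
ACTIVATED_HALF_LIFE_HR = 0.5   # after activation (30 min)


def fraction_remaining(t_hr: float, half_life_hr: float) -> float:
    """Fraction of receptor left after t_hr, ASSUMING first-order decay."""
    return 0.5 ** (t_hr / half_life_hr)


def rate_constant(half_life_hr: float) -> float:
    """First-order rate constant (per hour) implied by a half-life."""
    return math.log(2) / half_life_hr


# Ratio of the two rate constants: how much faster activated PAR1 is lost.
fold_change = rate_constant(ACTIVATED_HALF_LIFE_HR) / rate_constant(RESTING_HALF_LIFE_HR)

if __name__ == "__main__":
    t = 1.0  # hypothetical time point, in hours
    print(f"resting,   {t} hr: {fraction_remaining(t, RESTING_HALF_LIFE_HR):.3f}")
    print(f"activated, {t} hr: {fraction_remaining(t, ACTIVATED_HALF_LIFE_HR):.3f}")
    print(f"fold change in rate constant: {fold_change:.1f}")
```

Under this assumption, the change in half-life corresponds to a 16-fold increase in the rate constant for receptor loss, so that one-quarter of activated receptor would remain after 1 hr.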

When some cell types were exposed to thrombin for a prolonged period, a steady-state level of cleaved receptors was detected on the cell surface (16, 18). In such a state, cells were refractory to thrombin but responded to the PAR1-activating peptide SFLLRN (16, 18). Such responses were mediated by a subset of PAR1 molecules in which the tethered ligand was modified or otherwise prevented from functioning (24, 25). The significance of this phenomenon is unclear; it may represent a mechanism for dealing with the minority of activated PAR1 molecules that escape sorting to lysosomes.

Termination of PAR1 signaling thus occurs at several levels. The initial uncoupling of PAR1 depends on phosphorylation and may involve arrestin binding, as for other G protein-coupled receptors. Activated PAR1 is prevented from recycling and “resignaling” mainly by its sorting to lysosomes, a trafficking solution to a signaling problem. Such mechanisms for maintaining the temporal fidelity of thrombin signaling are presumably important in fibroblasts and vascular endothelial cells; both cell types express PAR1 and may need to respond to thrombin accurately over time.

While assuming special significance in the case of proteolytically activated PAR1, internalization and degradation of activated receptors are important for long-term down-regulation in many receptor systems. PAR1 may be useful as a model system for characterizing this sorting process in mammalian cells.

The finding that each PAR1 molecule is used once and discarded raises the question of how cells maintain responsiveness to thrombin over time. In fibroblasts and endothelial cells, unactivated PAR1 appears to cycle slowly between the cell surface and an intracellular compartment, such that at steady state approximately one-half of PAR1 molecules are inside the cell and protected from thrombin cleavage (19, 21). This intracellular “reserve” can repopulate the cell surface with naïve receptors without new receptor synthesis, thereby restoring or maintaining responsiveness to thrombin. Slow agonist-independent internalization of PAR1 is required for maintaining this intracellular reserve (20, 26). Hence, the irreversibility of PAR1’s proteolytic activation mechanism is accommodated by special desensitization and resensitization machinery. Like recycling and lysosomal sorting, tonic and agonist-triggered internalization of PAR1 were separable by mutation (20, 26). This observation suggests that distinct machinery may recognize naïve vs. activated PAR1 and that elucidating the molecular basis for PAR1’s trafficking behavior might reveal new mechanisms.
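The steady-state partitioning described above can be expressed with a simple two-compartment scheme. This is a sketch under stated assumptions, not a model taken from the cited studies: S and I denote surface and intracellular pools of unactivated PAR1, and k_in and k_out are hypothetical first-order rate constants for tonic internalization and return to the surface. Receptor synthesis and degradation are ignored.

```latex
\begin{align}
\frac{dI}{dt} &= k_{\mathrm{in}}\,S - k_{\mathrm{out}}\,I = 0
  \quad\Longrightarrow\quad
  \frac{I}{S} = \frac{k_{\mathrm{in}}}{k_{\mathrm{out}}}, \\
f_{\mathrm{inside}} &= \frac{I}{S + I}
  = \frac{k_{\mathrm{in}}}{k_{\mathrm{in}} + k_{\mathrm{out}}}, \\
f_{\mathrm{inside}} &\approx \tfrac{1}{2}
  \iff k_{\mathrm{in}} \approx k_{\mathrm{out}}.
\end{align}
```

Within this simplified scheme, an intracellular fraction of approximately one-half would correspond to roughly equal rates of tonic internalization and return, and abolishing tonic internalization (k_in near zero) would deplete the reserve, consistent with the requirement noted in the text.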

A Protease-Activated Receptor Family

Recognition and cleavage of PAR1 by thrombin is specified by two short stretches of amino acids in PAR1’s amino-terminal exodomain. LDPR/S binds thrombin’s active center, and the “hirudin-like” sequence DKYEPF binds thrombin’s fibrinogen-binding exosite (4, 27–30). Thrombin’s role in activating PAR1 appears limited to cleaving the receptor (4, 30). Indeed, replacing the PAR1 thrombin cleavage site LDPR/S with the enteropeptidase cleavage site DDDDK/S produced a receptor that signaled to enteropeptidase but not thrombin (4). A trypsin cleavage site was similarly effective (25). It is noteworthy that such a discrete sequence dictates receptor specificity. One might expect that it would be relatively easy to generate a family of receptors with distinct protease specificities once one protease-activated receptor had evolved.
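The cleavage-site swap described above can be illustrated with a short string-matching sketch. It is deliberately simplified: the "exodomain" here is a toy fragment assembled only from the motifs named in the text (a cleavage site followed by SFLLRN), not the full PAR1 sequence, and protease recognition is reduced to an exact match on the cleavage site, ignoring the contribution of the hirudin-like sequence. The function and constant names are hypothetical.

```python
from typing import Optional, Tuple

# Cleavage sites from the text, written as (residues before the cut, residue after).
THROMBIN_SITE: Tuple[str, str] = ("LDPR", "S")
ENTEROPEPTIDASE_SITE: Tuple[str, str] = ("DDDDK", "S")

# First six residues of the new amino terminus unmasked by cleavage.
TETHERED_LIGAND = "SFLLRN"


def make_toy_exodomain(site: Tuple[str, str]) -> str:
    """Build a toy fragment: cleavage-site residues followed by the ligand."""
    before_cut, after_cut = site
    # The ligand itself supplies the residue after the scissile bond.
    assert TETHERED_LIGAND.startswith(after_cut)
    return before_cut + TETHERED_LIGAND


def cleave(exodomain: str, site: Tuple[str, str]) -> Optional[str]:
    """Return the new amino terminus if the site is present, else None."""
    before_cut, after_cut = site
    index = exodomain.find(before_cut + after_cut)
    if index == -1:
        return None  # protease does not recognize this fragment
    return exodomain[index + len(before_cut):]


wild_type = make_toy_exodomain(THROMBIN_SITE)        # LDPR/SFLLRN
mutant = make_toy_exodomain(ENTEROPEPTIDASE_SITE)    # DDDDK/SFLLRN

if __name__ == "__main__":
    for name, fragment in [("wild type", wild_type), ("mutant", mutant)]:
        print(name, "thrombin ->", cleave(fragment, THROMBIN_SITE))
        print(name, "enteropeptidase ->", cleave(fragment, ENTEROPEPTIDASE_SITE))
```

In this toy representation, either protease unmasks the same SFLLRN amino terminus, but only from the fragment that carries its own cleavage site, which mirrors the observation that the cleavage site alone switched the receptor's protease specificity.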

FIG. 2. Protease-activated receptor family. Four PARs are known. Amino acid sequence identity between human (h-) and mouse (m-) homologues of each is approximately 60%, but identity between different PARs within a single species falls to approximately 30%. Xen indicates Xenopus. Human PAR1, PAR3, and PAR4 can be activated by thrombin, and sensing thrombin is likely, at least in part, their role in vivo (see text). One receptor, PAR2, is activated by trypsin and tryptase but not by thrombin. Its roles in vivo remain to be explored. The four PAR genes share a common two-exon structure. In essence, the first exon encodes a signal peptide and the second the mature receptor protein. The genes encoding PARs 1, 2, and 3 are adjacent in the mouse and human genomes, whereas the PAR4 gene resides at a separate location (32, 65, 66).

Four PARs are now known (Fig. 2). PAR1, PAR3, and PAR4 are thrombin receptors (2, 3, 31–33). PAR1 and human PAR3 respond to thrombin at subnanomolar concentrations (2, 3, 31, 33). PAR4 requires higher but probably still physiological levels of thrombin for activation (see below) (32, 33), perhaps because it lacks the hirudin-like thrombin-binding sequence that is present in PAR1 and PAR3. PAR2 is activated by trypsin and tryptase, not by thrombin (34, 35).

It is interesting to note that a Xenopus thrombin receptor (36) is clearly identifiable as a PAR1 homologue (Fig. 2), suggesting that several PAR genes may have existed before amphibians and mammals diverged. How and in what context did PARs evolve? It was relatively easy to “evolve” a tethered ligand in vitro for the formyl peptide receptor (37). However, the identity of the common ancestor of PARs and other G protein-coupled receptors, the temporal relationship of the appearance of PAR genes vs. that of various protease cascades, and the function of the first PAR are unknown.

Given the importance of thrombin and platelets in myocardial infarction and other thrombotic processes, identification of the receptors responsible for thrombin signaling in platelets has been a high priority. Recent studies outlined below provide a model for the roles of the known PARs in this process. The roles of PARs in other cell types and processes are just beginning to be explored.

PARs and Platelet Activation

Our understanding of the role of PARs in platelet activation is evolving rapidly. PAR1 mRNA and protein were detected in human platelets (2, 38–40). PAR1-activating peptides activated human platelets (2, 6, 7). PAR1-blocking antibodies inhibited human platelet activation by low but not high concentrations of thrombin (38, 39). These data suggested a role for PAR1 in activation of human platelets by thrombin but held open the possibility that other receptors contribute. Curiously, in mouse platelets, PAR1 appeared to play no role. PAR1 expression was difficult to detect and PAR1-activating peptides did not activate rodent platelets (41–43). Moreover, platelets from PAR1-deficient mice responded like wild-type platelets to thrombin (43). The latter observation prompted a search for additional thrombin receptors and led to the identification of PAR3 (31). PAR3 was indeed expressed in mouse platelets (31) but could not be detected in human platelets (44). Inhibition of PAR3 function with antibodies that bound to PAR3’s hirudin-like domain or by gene knockout prevented mouse platelet activation by low but not high concentrations of thrombin (33, 45). These results established that PAR3 is necessary for normal thrombin signaling in mouse platelets but also pointed to the existence of another platelet thrombin receptor. Such a receptor, PAR4, was recently identified (32, 33). PAR4 appears to function in both mouse and human platelets (32, 33, 44). Thus in both mouse and human, platelets utilize two thrombin receptors. A “high-affinity” thrombin receptor (PAR1 in human, PAR3 in mouse) is necessary for responses to low concentrations of thrombin, whereas a “low-affinity” receptor (PAR4 in both species) mediates responses at higher concentrations of thrombin. Do these receptors account for thrombin activation of platelets? Addressing this question at the genetic level awaits generation of a mouse deficient in both PAR3 and PAR4. In the meantime, pharmacological studies of human platelets suggest that the answer might be yes (44). Inhibition of PAR1 function alone (whether by blocking antibody, antagonist, or desensitization) inhibited platelet responses at 1 nM thrombin but only slowed responses at 30 nM thrombin. Inhibition of PAR4 function alone with a blocking antibody had no effect at either concentration. Strikingly, combined inhibition of PAR1 and PAR4 signaling profoundly inhibited platelet responses even at high concentrations of thrombin (44).
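The two-receptor logic of these pharmacological observations in human platelets can be summarized as a small qualitative model. This is a sketch for illustration only: the numerical thresholds below are hypothetical placeholders chosen so that PAR1 responds at 1 nM thrombin and PAR4 responds at 30 nM but not at 1 nM, and the three outcome labels are a simplification of the reported responses.

```python
# Hypothetical activation thresholds (nM thrombin); NOT measured values.
PAR1_THRESHOLD_NM = 0.5    # "high-affinity" receptor
PAR4_THRESHOLD_NM = 10.0   # "low-affinity" receptor


def platelet_response(thrombin_nm: float,
                      par1_blocked: bool = False,
                      par4_blocked: bool = False) -> str:
    """Qualitative response of a human platelet in this toy model."""
    par1_signals = thrombin_nm >= PAR1_THRESHOLD_NM and not par1_blocked
    par4_signals = thrombin_nm >= PAR4_THRESHOLD_NM and not par4_blocked
    if par1_signals:
        return "full"        # PAR1 alone suffices for a robust response
    if par4_signals:
        return "slowed"      # PAR4 alone still supports a response
    return "inhibited"


if __name__ == "__main__":
    for dose in (1.0, 30.0):
        for p1, p4 in [(False, False), (True, False), (False, True), (True, True)]:
            outcome = platelet_response(dose, par1_blocked=p1, par4_blocked=p4)
            print(f"{dose:>4} nM  PAR1 blocked={p1!s:<5} PAR4 blocked={p4!s:<5} -> {outcome}")
```

With these assumed thresholds the model reproduces the pattern reported in the text: blocking PAR1 alone inhibits the response at 1 nM and slows it at 30 nM, blocking PAR4 alone has no effect at either concentration, and blocking both inhibits the response even at 30 nM.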

Available data suggest that PAR4 activation is not necessary for robust responses in human platelets when PAR1 function is intact. Why do platelets have two receptors? Aside from providing a backup signaling device, PAR4 might allow platelets to respond to proteases other than thrombin, mediate thrombin signaling to distinct effectors or with a tempo different from that of PAR1, or function in platelet responses beyond simple secretion and aggregation. The existence of two genes and gene products also raises the possibility of differential regulation at many levels in platelets or other cell types. Most interestingly, it is possible that PARs interact. These issues remain to be explored.

The identification of the receptors that mediate platelet activation by thrombin raises important questions regarding strategies for the development of antithrombotic therapies. Clearly PAR antagonists can be developed (44, 46). The observation that PAR1 inhibition blocked platelet responses to low concentrations of thrombin and slowed responses to high concentrations raises the question of whether PAR1 inhibition alone might be sufficient for an antithrombotic effect (44, 47). Alternatively, it may be necessary to block both PAR1 and PAR4 to prevent or arrest thrombosis in vivo. Whether such strategies should be pursued can now be determined by using receptor blocking reagents in appropriate animal models.

A Role for Thrombin Signaling in Embryonic Development and Other Processes?

The role of PARs in cell types other than platelets is under active investigation in a number of laboratories. Several attractive hypotheses focus on possible roles for PARs in protease signaling to the blood vessel wall. In the adult, PAR1 is expressed by vascular endothelial cells and smooth muscle cells and is thus opportunely positioned to mediate communication between blood and the cells comprising the vessel wall. In cell culture, thrombin causes endothelial cells to deliver the leukocyte adhesion molecule P-selectin to their surfaces (48), to secrete von Willebrand factor (48), to elaborate growth factors and cytokines (49, 50), and to change shape and increase permeability (51). Thrombin is also a mitogen for fibroblasts (52) and vascular smooth muscle cells (53) and has a variety of metabolic effects on these cells. Vascular injury in any form, whether metabolic, mechanical, immune-mediated, or infectious, is likely to promote local thrombin generation at some level. These considerations prompt the hypothesis that thrombin might participate in acute and/or chronic inflammatory and proliferative responses to vascular injury. One might also imagine a role for thrombin signaling in the setting of angiogenesis, where leaky nascent vessels might trigger local thrombin activity. PAR-deficient mice will be invaluable for testing such hypotheses.

We are particularly interested in the role of PAR1 in embryonic development because it may reveal unanticipated roles for the coagulation cascade that are independent of platelet activation and fibrin formation. Approximately half of PAR1-deficient embryos die between embryonic days 9.5 and 10.5 (43, 54). Histological examination of these embryos revealed embryonic blood cells in the pericardial, amniotic, and exocoelomic cavities, suggesting a defect in hemostatic mechanisms or vascular integrity (C. Griffin and S.R.C., unpublished results). Deficiency of prothrombin or factor V, which is necessary for thrombin generation, caused grossly similar developmental defects (55–57). Although one might ascribe bleeding in these knockouts to failed fibrin generation and/or platelet activation, fibrinogen (58) and platelets (59) are not necessary for normal embryonic development. Moreover, PAR1 is not expressed in mouse platelets, at least in the adult, and platelets from the PAR1-deficient mice that survived to adulthood had no defect in their response to thrombin (43). The relationships of the developmental phenotypes of PAR1, factor V, and prothrombin deficiency have not been formally tested, and it is certainly possible, even likely, that thrombin acts on targets other than PAR1 and/or that PAR1 has activators other than thrombin during development. Nonetheless, it is tempting to postulate that the “vascular integrity defect” common to PAR1-, prothrombin-, and factor V-deficient embryos is due at least in part to defective thrombin signaling in cells other than platelets. Although PAR1 is expressed in a variety of cell types at embryonic day 9.5 (E9.5), in situ hybridization of E9.5 embryos revealed PAR1 mRNA to be most abundant in endothelial cells (ref. 60 and data not shown). This prompts the working hypothesis that PAR1 signaling in endothelial cells is important for normal vascular development. Thrombin generation is triggered when factor VIIa in plasma meets extravascular tissue factor; hence the coagulation protease cascade can be viewed in part as a “leak detector” for blood vessels. Perhaps developing blood vessels use this system to monitor their functional status as they grow and remodel. Studies designed to test the role of endothelial PAR1 in vascular development are ongoing.

Summary

PARs provide one mechanism by which proteases can act as hormones and talk directly to cells. In PARs, nature has utilized a mechanism analogous to zymogen activation to trigger ligation of a G protein-coupled receptor. The irreversibility of this activation mechanism poses an unusual problem for receptor desensitization and resensitization, a problem solved by specialized receptor trafficking. Such trafficking bells and whistles raise the question of how long PARs have had to evolve and how broad their spectrum of activities might be. Four PARs are now known. Given the myriad of membrane-anchored and soluble extracellular proteases, it would not be surprising if more existed. Indeed, because only a few amino acids in their amino-terminal exodomains dictate the specificity of PARs for their activating proteases, one might predict that new PARs with new protease specificities might “easily” evolve. Thrombin’s cellular actions motivated the search for PAR1 (2, 3), and descriptions of cellular responses to trypsin that were independent of PAR1 presaged the identification of PAR2 (61). Cathepsin G and tissue factor/VIIa each elicit interesting signaling phenomena (62–64), as do a variety of other proteases; whether known or new PARs will account for such signaling remains to be determined. Similarly, defining the roles of the known PARs in vivo in normal and disease states remains an important challenge. These receptors have already provided useful insights into regulation of platelet function and are likely to provide surprises regarding the regulatory roles of proteases in other cell types and processes.

1. Davey, M. & Luscher, E. (1967) Nature (London) 216, 857–858.
2. Vu, T.-K. H., Hung, D.T., Wheaton, V.I. & Coughlin, S.R. (1991) Cell 64, 1057–1068.
3. Rasmussen, U.B., Vouret-Craviari, V., Jallat, S., Schlesinger, Y., Pages, G., Pavirani, A., Lecocq, J.P., Pouyssegur, J. & Van Obberghen-Schilling, E. (1991) FEBS Lett. 288, 123–128.
4. Vu, T.-K. H., Wheaton, V.I., Hung, D.T. & Coughlin, S.R. (1991) Nature (London) 353, 674–677.
5. Chen, J., Ishii, M., Wang, L., Ishii, K. & Coughlin, S.R. (1994) J. Biol. Chem. 269, 16041–16045.
6. Vassallo, R.J., Kieber, E.T., Cichowski, K. & Brass, L.F. (1992) J. Biol. Chem. 267, 6081–6085.
7. Scarborough, R.M., Naughton, M.A., Teng, W., Hung, D.T., Rose, J., Vu, T.K., Wheaton, V.I., Turck, C.W. & Coughlin, S.R. (1992) J. Biol. Chem. 267, 13146–13149.
8. Coller, B.S., Ward, P., Ceruso, M., Scudder, L.E., Springer, K., Kutok, J. & Prestwich, G.D. (1992) Biochemistry 31, 11713–11720.
9. Bode, W., Schwager, P. & Huber, R. (1978) J. Mol. Biol. 118, 99–112.
10. Yu, S.S., Lefkowitz, R.J. & Hausdorff, W.P. (1993) J. Biol. Chem. 268, 337–341.
11. Krueger, K.M., Daaka, Y., Pitcher, J.A. & Lefkowitz, R.J. (1997) J. Biol. Chem. 272, 5–8.
12. Lohse, M., Benovic, J., Codina, J., Caron, M. & Lefkowitz, R. (1990) Science 248, 1547–1550.
13. Freedman, N.J. & Lefkowitz, R.J. (1996) Recent Prog. Horm. Res. 51, 319–351; Discussion 352–353.
14. Ferguson, S.S., Downey, W.R., Colapietro, A.M., Barak, L.S., Menard, L. & Caron, M.G. (1996) Science 271, 363–366.
15. Goodman, O.J., Krupnick, J.G., Santini, F., Gurevich, V.V., Penn, R.B., Gagnon, A.W., Keen, J.H. & Benovic, J.L. (1996) Nature (London) 383, 447–450.
16. Ishii, K., Hein, L., Kobilka, B. & Coughlin, S.R. (1993) J. Biol. Chem. 268, 9780–9786.
17. Ishii, K., Chen, J., Ishii, M., Koch, W.J., Freedman, N.J., Lefkowitz, R.J. & Coughlin, S.R. (1994) J. Biol. Chem. 269, 1125–1130.
18. Hoxie, J.A., Ahuja, M., Belmonte, E., Pizarro, S., Parton, R. & Brass, L.F. (1993) J. Biol. Chem. 268, 13756–13763.
19. Hein, L., Ishii, K., Coughlin, S.R. & Kobilka, B.K. (1994) J. Biol. Chem. 269, 27719–27726.
20. Shapiro, M.J., Trejo, J., Zeng, D.W. & Coughlin, S.R. (1996) J. Biol. Chem. 271, 32874–32880.
21. Woolkalis, M.J., DeMelfi, T.J., Blanchard, N., Hoxie, J.A. & Brass, L.F. (1995) J. Biol. Chem. 270, 9868–9875.
22. Trejo, J., Hammes, S.R. & Coughlin, S.R. (1998) Proc. Natl. Acad. Sci. USA 95, 13698–13702.
23. Trejo, J. & Coughlin, S.R. (1999) J. Biol. Chem. 274, 2216–2224.
24. Trejo, J., Connolly, A.J. & Coughlin, S.R. (1996) J. Biol. Chem. 271, 21536–21541.
25. Hammes, S.R. & Coughlin, S.R. (1999) Biochemistry 38, 2486–2493.
26. Shapiro, M.J. & Coughlin, S.R. (1998) J. Biol. Chem. 273, 29009–29014.
27. Liu, L., Vu, T.-K. H., Esmon, C.T. & Coughlin, S.R. (1991) J. Biol. Chem. 266, 16977–16980.
28. Mathews, I.I., Padmanabhan, K.P., Ganesh, V., Tulinsky, A., Ishii, M., Chen, J., Turck, C.W., Coughlin, S.R. & Fenton, J.N. (1994) Biochemistry 33, 3266–3279.
29. Hung, D.T., Vu, T.-K. H., Wheaton, V.I., Charo, I.F., Nelken, N.A., Esmon, C.T. & Coughlin, S.R. (1992) J. Clin. Invest. 89, 444–450.
30. Ishii, K., Gerszten, R., Zheng, Y.-W., Turck, C.W. & Coughlin, S.R. (1995) J. Biol. Chem. 270, 16435–16440.
31. Ishihara, H., Connolly, A.J., Zeng, D., Kahn, M.L., Zheng, Y.W., Timmons, C., Tram, T. & Coughlin, S.R. (1997) Nature (London) 386, 502–506.
32. Xu, W.F., Andersen, H., Whitmore, T.E., Presnell, S.R., Yee, D.P., Ching, A., Gilbert, T., Davie, E.W. & Foster, D.C. (1998) Proc. Natl. Acad. Sci. USA 95, 6642–6646.
33. Kahn, M.L., Zheng, Y.W., Huang, W., Bigornia, V., Zeng, D., Moff, S., Farese, R.V., Jr., Tam, C. & Coughlin, S.R. (1998) Nature (London) 394, 690–694.
34. Nystedt, S., Emilsson, K., Wahlestedt, C. & Sundelin, J. (1994) Proc. Natl. Acad. Sci. USA 91, 9208–9212.
35. Nystedt, S., Emilsson, K., Larsson, A.K., Strombeck, B. & Sundelin, J. (1995) Eur. J. Biochem. 232, 84–89.
36. Gerszten, R.E., Chen, J., Ishii, M., Ishii, K., Wang, L., Nanevicz, T., Turck, C.W., Vu, T.-H. K. & Coughlin, S.R. (1994) Nature (London) 368, 648–651.
37. Chen, J., Bernstein, H.S., Chen, M., Wang, L., Ishii, M., Turck, C.W. & Coughlin, S.R. (1995) J. Biol. Chem. 270, 23398–23401.
38. Hung, D.T., Vu, T.K., Wheaton, V.I., Ishii, K. & Coughlin, S.R. (1992) J. Clin. Invest. 89, 1350–1353.
39. Brass, L.F., Vassallo, R.R., Belmonte, E., Ahuja, M., Cichowski, K. & Hoxie, J.A. (1992) J. Biol. Chem. 267, 13795–13798.
40. Molino, M., Bainton, D.F., Hoxie, J.A., Coughlin, S.R. & Brass, L.F. (1997) J. Biol. Chem. 272, 6011–6017.
41. Derian, C.K., Santulli, R.J., Tomko, K.A., Haertlein, B.J. & Andrade-Gordon, P. (1995) Thromb. Res. 6, 505–519.
42. Connolly, T.M., Condra, C., Feng, D.M., Cook, J.J., Stranieri, M.T., Reilly, C.F., Nutt, R.F. & Gould, R.J. (1994) Thromb. Haemostasis 72, 627–633.
43. Connolly, A.J., Ishihara, H., Kahn, M.L., Farese, R.V. & Coughlin, S.R. (1996) Nature (London) 381, 516–519.
44. Kahn, M.L., Nakanishi-Matsui, M., Shapiro, M.J., Ishihara, H. & Coughlin, S.R. (1999) J. Clin. Invest. 103, 879–887.
45. Ishihara, H., Zeng, D., Connolly, A.J., Tam, C. & Coughlin, S.R. (1998) Blood 91, 4152–4157.

46. Bernatowicz, M.S., Klimas, C.E., Hartl, K.S., Peluso, M., Allegretto, N.J. & Seiler, S.M. (1996) J. Med. Chem. 39, 4879–4887.
47. Cook, J.J., Sitko, G.R., Bednar, B., Condra, C., Mellott, M.J., Feng, D.M., Nutt, R.F., Shafer, J.A., Gould, R.J. & Connolly, T.M. (1995) Circulation 91, 2961–2971.
48. Hattori, R., Hamilton, K.K., Fugate, R.D., McEver, R.P. & Sims, P.J. (1989) J. Biol. Chem. 264, 7768–7771.
49. Daniel, T.O., Gibbs, V.C., Milfay, D.F., Garavoy, M. & Williams, L.T. (1986) J. Biol. Chem. 261, 9579–9582.
50. Colotta, F., Sciacca, F.L., Sironi, M., Luini, W., Rabiet, M.J. & Mantovani, A. (1994) Am. J. Pathol. 144, 975–985.
51. Lum, H. & Malik, A.B. (1994) Am. J. Physiol. 267, L223–L241.
52. Chen, L.B. & Buchanan, J.M. (1975) Proc. Natl. Acad. Sci. USA 72, 131–135.
53. McNamara, C.A., Sarembok, I.J., Gimple, L.W., Fenton, J.W., II, Coughlin, S.R. & Owens, G.K. (1992) J. Clin. Invest. 91, 94–98.
54. Darrow, A.L., Fung, L.W., Ye, R.D., Santulli, R.J., Cheung, W.M., Derian, C.K., Burns, C.L., Damiano, B.P., Zhou, L., Keenan, C.M., et al. (1996) Thromb. Haemostasis 76, 860–866.
55. Sun, W.Y., Witte, D.P., Degen, J.L., Colbert, M.C., Burkart, M.C., Holmback, K., Xiao, Q., Bugge, T.H. & Degen, S.J. (1998) Proc. Natl. Acad. Sci. USA 95, 7597–7602.
56. Xue, J., Wu, Q., Westfield, L.A., Tuley, E.A., Lu, D., Zhang, Q., Shim, K., Zheng, X. & Sadler, J.E. (1998) Proc. Natl. Acad. Sci. USA 95, 7603–7607.
57. Cui, J., O’Shea, K.S., Purkayastha, A., Saunders, T.L. & Ginsburg, D. (1996) Nature (London) 384, 66–68.
58. Suh, T.T., Holmback, K., Jensen, N.J., Daugherty, C.C., Small, K., Simon, D.I., Potter, S. & Degen, J.L. (1995) Genes Dev. 9, 2020–2033.
59. Shivdasani, R.A., Rosenblatt, M.F., Zucker, F.D., Jackson, C.W., Hunt, P., Saris, C.J. & Orkin, S.H. (1995) Cell 81, 695–704.
60. Soifer, S.J., Peters, K.G., O’Keefe, J. & Coughlin, S.R. (1993) Am. J. Pathol. 144, 60–69.
61. Levine, L. (1994) Prostaglandins 47, 437–449.
62. Selak, M. (1994) Biochem. J. 297, 269–275.
63. Røttingen, J.A., Enden, T., Camerer, E., Iversen, J.G. & Prydz, H. (1995) J. Biol. Chem. 270, 4650–4660.
64. Camerer, E., Røttingen, J.A., Iversen, J.G. & Prydz, H. (1996) J. Biol. Chem. 271, 29034–29042.
65. Schmidt, V.A., Nierman, W.C., Maglott, D.R., Cupit, L.D., Moskowitz, K.A., Wainer, J.A. & Bahou, W.F. (1998) J. Biol. Chem. 273, 15061–15068.
66. Kahn, M.L., Hammes, S.R., Botka, C. & Coughlin, S.R. (1998) J. Biol. Chem. 273, 23290–23296.

VanX, a bacterial D-alanyl-D-alanine dipeptidase: Resistance, immunity, or survival function?

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

IVAN A. D. LESSARD AND CHRISTOPHER T. WALSH*

Biological Chemistry and Molecular Pharmacology Department, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115

ABSTRACT The zinc-containing D-alanyl-D-alanine (D-Ala-D-Ala) dipeptidase VanX has been detected in both Gram-positive and Gram-negative bacteria, where it appears to have adapted to at least three distinct physiological roles. In pathogenic vancomycin-resistant enterococci, vanX is part of a five-gene cluster that is switched on to reprogram cell-wall biosynthesis to produce peptidoglycan chain precursors terminating in D-alanyl-D-lactate (D-Ala-D-lactate) rather than D-Ala-D-Ala. The modified peptidoglycan exhibits a 1,000-fold decrease in affinity for vancomycin, accounting for the observed phenotypic resistance. In the glycopeptide antibiotic producers Streptomyces toyocaensis and Amycolatopsis orientalis, a vanHAX operon may have coevolved with antibiotic biosynthesis genes to provide immunity by reprogramming cell-wall termini to D-Ala-D-lactate as antibiotic biosynthesis is initiated. In the Gram-negative bacterium Escherichia coli, which is never challenged by the glycopeptide antibiotics because they cannot penetrate the outer membrane permeability barrier, the vanX homologue (ddpX) is cotranscribed with a putative dipeptide transport system (ddpABCDF) in stationary phase by the transcription factor RpoS (σs). The combined action of DdpX and the permease would permit hydrolysis of D-Ala-D-Ala transported back into the cytoplasm from the periplasm as cell-wall crosslinks are refashioned. The D-Ala product could then be oxidized as an energy source for cell survival under starvation conditions.

Much attention has been focused recently on the alarming increase in antibiotic resistance in bacterial pathogens (1–3). The explosive emergence of vancomycin-resistant enterococci as life-threatening organisms in hospital settings worldwide (4–6) has led to intensive investigation of the molecular determinants of glycopeptide antibiotic resistance (7–11). These investigations have revealed one of the most sophisticated molecular systems of acquired resistance and a paradigm of genetic adaptation (4). Vancomycin resistance uses a strategy of reprogramming the termini of peptidoglycan (PG) intermediates in cell-wall crosslinking steps from D-alanyl-D-alanine (D-Ala-D-Ala) termini to D-alanyl-D-lactate (D-Ala-D-lactate) termini. The modified PG binds vancomycin 1,000-fold less avidly than the D-Ala-D-Ala PG because of the loss of a central hydrogen bond from the NH of the D-Ala-D-Ala moiety to the vancomycin backbone carbonyl, accounting quantitatively for the gain in phenotypic resistance (9) (Fig. 1 A–C). A three-gene operon, vanHAX, found on a transposable element directs the reprogramming: the VanH and VanA proteins act sequentially to synthesize D-Ala-D-lactate, while VanX selectively hydrolyzes the D-Ala-D-Ala produced by the host enzyme but not D-Ala-D-lactate, allowing the depsipeptide to accumulate and become incorporated into the growing PG termini (9, 10, 12). The amounts of VanH, -A, and -X in the cells are in turn controlled by a two-component regulatory system involving a transmembrane sensor kinase, VanS, and a response-regulating transcription factor, VanR, that becomes active when phosphorylated by VanS (11, 13), following the established paradigms for monitoring of environmental cues.
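The 1,000-fold loss in vancomycin affinity can be restated as a binding free-energy penalty through the standard thermodynamic relation ΔΔG = RT ln(fold change). The short Python sketch below does only that arithmetic; the temperature of 298.15 K and the function name ddg_from_fold_change are illustrative assumptions and are not taken from the cited work.

```python
import math

# Molar gas constant expressed in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def ddg_from_fold_change(fold_change: float, temperature_k: float = 298.15) -> float:
    """Free-energy penalty (kcal/mol) equivalent to a fold loss in binding affinity.

    Uses ddG = R * T * ln(fold_change). The default temperature (25 degrees C)
    is an assumption chosen for illustration.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    penalty = ddg_from_fold_change(1000.0)
    print(f"1,000-fold weaker binding ~ {penalty:.2f} kcal/mol at 298.15 K")
```

Under these assumptions the 1,000-fold difference corresponds to about 4.1 kcal/mol (roughly 17 kJ/mol) of binding free energy.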

Although all five of the necessary and sufficient proteins, VanR, -S, -H, -A, and -X, have now been characterized, this paper addresses some broader biological questions that have recently arisen around the functions of VanX in diverse bacterial physiology. In particular, VanX homologues have been discovered in the bacteria that produce vancomycin and related glycopeptide antibiotics (14, 15) as well as in Escherichia coli, a Gram-negative bacterium that is intrinsically indifferent to vancomycin because of the failure of the antibiotic to penetrate the outer membrane barrier (15) (Table 1).

Enterococcal VanX (EntVanX): A Zinc-Dependent D-Ala-D-Ala Dipeptidase of Exquisite Specificity. The first indication of function of EntVanX was provided by Reynolds et al. (10) with the observation that overproduction in E. coli led to activity in the crude extract that hydrolyzed D-Ala-D-Ala but not D-Ala-D-lactate in a β-lactam-insensitive manner. The purification of EntVanX was then undertaken in this laboratory (16, 17), with a maltose-binding protein (MBP)-EntVanX fusion under control of the T7 promoter being used to solve problems of protein aggregation, purification, and most notably toxicity to E. coli (17). The substrate specificity was exclusive for D,D-dipeptides with unmodified N and C termini, and catalytic efficiency analysis suggested up to 10¹⁰-fold selection for D-Ala-D-Ala hydrolysis compared with D-Ala-D-lactate, a contrathermodynamic selection for amide over ester bond hydrolysis (15, 16) (Table 2). The MBP-EntVanX active site binds one catalytically essential zinc atom (17). Sequence analysis did not detect consensus catalytic zinc-binding motifs, but comparison with the functional homolog zinc-dependent N-acyl-D-Ala-D-Ala carboxypeptidase from Streptomyces albus G and with the zinc-containing N-terminal domain of murine Sonic hedgehog suggested a motif using His-116, Asp-123, and His-184 (EntVanX) as the zinc ligand set with a conserved Glu-181 as a catalytic base. These predictions were validated first by site-directed mutagenesis to correlate zinc content and catalytic activity (17) and most recently by the determination, by Bussiere et al. at Abbott Laboratories (18), of the x-ray structure of the free EntVanX enzyme as well as of complexes with D-Ala-D-Ala and a slow-binding phosphinate analog (19) of the proposed tetrahedral reaction intermediate (Fig. 2 A and B). The structure indicates that EntVanX is a variant of a metallo aminopeptidase and that the small constricted active site cavity of 150 Å³ may make rational design of inhibitors a significant medicinal chemistry challenge. The likely mechanism for D,D-dipeptide hydrolysis is shown in Fig. 2C, with Glu-181 acting as catalytic base and Arg-71 as a cationic coordinator both in the ground state and for stabilization of the developing negative charge in the tetrahedral adduct (18, 20). The structure predicted Asp-123, Asp-142, and Tyr-21 residues for the recognition of the D-Ala-D-Ala dipeptide substrate α-NH₃⁺ and Ser-114 for the carboxylate group. Site-directed mutagenesis of active site residues has revealed roles consistent with predictions of recognition and catalysis (20).

*To whom reprint requests should be addressed. E-mail: [email protected]
PNAS is available online at www.pnas.org.
Abbreviations: A2pm, diaminopimelate; D-Ala-D-Ala, D-alanyl-D-alanine; D-Ala-D-lactate, D-alanyl-D-lactate; EntVanX, enterococcal VanX; DdpX, Escherichia coli VanX homolog; PG, peptidoglycan; StoVanX, Streptomyces toyocaensis VanX homolog.

FIG. 1. (A) Vancomycin binds the D-Ala-D-Ala moiety of the growing peptidoglycan and sterically occludes the transglycosylation and transpeptidation steps of cell-wall assembly. The immature cell wall results in cells susceptible to lysis through osmotic shock. (B) The alternative cell-wall biosynthetic pathway of the VanH, -A, -X proteins, producing peptidoglycan intermediates with D-Ala-D-lactate termini in place of the usual D-Ala-D-Ala termini (12). Pyruvate is reduced by the NADP-dependent dehydrogenase VanH to D-lactate, which is then used as substrate for the ATP-dependent D-Ala-D-lactate depsipeptide ligase VanA. The product D-Ala-D-lactate depsipeptide is used by the enzyme MurF to produce the muramyl-peptidyl-D-lactate intermediate and brought forward in subsequent cell-wall biosynthesis. The zinc-dependent D,D-dipeptidase VanX specifically hydrolyzes the D-Ala-D-Ala dipeptide pool produced by the native D-Ala-D-Ala ligase Ddl without hydrolyzing the D-Ala-D-lactate and in this way effectively shunts the flux of cell-wall biosynthesis to the ester termini. Substitution of D-Ala by D-lactate does not impair crosslinking of the modified precursors to the growing peptidoglycan chain, resulting in a mechanically strong peptidoglycan layer and cell survival. (C) Structures of the vancomycin complexes with N-acyl-D-Ala-D-Ala and N-acyl-D-Ala-D-lactate (9). Vancomycin binds to the D-Ala-D-Ala termini through a five-hydrogen bond network. The key hydrogen bond between the D-Ala amide NH and the vancomycin backbone carbonyl is lost in the N-acyl-D-Ala-D-lactate complex, resulting in a 1,000-fold reduction in the affinity of the antibiotic.

Table 1. VanX homologs

VanX source | Role
Vancomycin-resistant enterococci: Enterococcus faecium, Enterococcus faecalis | Reprogram cell walls for vancomycin resistance in opportunistic pathogens
Glycopeptide producers: Streptomyces toyocaensis, Amycolatopsis orientalis | Coevolution of vanHAX operon with antibiotic biosynthesis genes for immunity
Stationary-phase survival mechanism: Escherichia coli | Transport D-Ala-D-Ala from periplasm back to cytoplasm as cell-wall crosslinks are refashioned and use as RpoS-mediated energy source

VanX Homologs in the Bacteria That Produce Vancomycin and Related Glycopeptide Antibiotics. In many instances, bacteria that produce antibiotics have evolved strategies and mechanisms that provide immunity to the action of the antibiotic, and there is a general supposition that immunity mechanisms will have coevolved with antibiotic biosynthesis genes to protect the producing organisms (21, 22). Streptomyces toyocaensis synthesizes and secretes a vancomycin-type glycopeptide antibiotic (A47934), and the molecular basis of immunity for this organism and most likely for Amycolatopsis orientalis, which produces vancomycin, has recently been deconvolved (14, 15, 23, 24). PCR probes to the EntVanX zinc-binding motif revealed an S. toyocaensis VanX homologue (StoVanX) with 63% similarity to EntVanX, and sequencing analysis then indicated a three-gene operon in S. toyocaensis and A. orientalis equivalent and similarly oriented to the vanHAX operon from vancomycin-resistant enterococci (Fig. 3A). Expression and purification of StoVanX confirmed that it is a high-efficiency D,D-dipeptidase acting on dipeptides with unmodified N and C termini and that it lacks activity toward the D-Ala-D-lactate depsipeptide (15) (Table 2). These findings suggest a conserved mechanism for the observed intrinsic resistance of the antibiotic producers to the vancomycin class of glycopeptides and that, before S. toyocaensis produces the glycopeptide A47934, it has D-Ala-D-Ala peptidoglycan termini and is sensitive to vancomycin-type antibiotics. Furthermore, S. toyocaensis possesses two D,D-ligases: a D-Ala-D-lactate ligase encoded by the vanHAX operon and a D-Ala-D-Ala ligase encoded by a separate gene on the chromosome (24).


The S. toyocaensis vanHAX equivalents may be switched on transcriptionally when the host turns on the cluster of genes to synthesize the glycopeptide A47934, reprogramming PG termini to end in D-Ala-D-lactate and providing in situ resistance to the produced antibiotic (Fig. 3B). This three-gene operon has also been detected in other glycopeptide-producing organisms (14), and one of these operons may have been the origin of the enterococcal vanHAX genes that are on transposable elements in most of the VanA clinical phenotypes of vancomycin-resistant enterococci (VRE). Notably, the G+C content of the vanHAX operon in VRE is 5–10% higher than that of the adjacent vanSR genes and the chromosomal genes of enterococci. These findings also exemplify a mechanism for coevolution of glycopeptide antibiotic production and glycopeptide antibiotic resistance, the latter then appropriated by the opportunistic pathogenic enterococci.

FIG. 2. (A) Structure of EntVanX (18). (B) Active site topology of the EntVanX complex with the phosphinate analog (18). The zinc atom is coordinated by His-116, Asp-123, and His-184. The phosphinate analog α-NH₃⁺ hydrogen bonds with Asp-123, Asp-142, and Tyr-21, whereas Ser-114 hydrogen bonds with the carboxylate group. Arg-71 stabilizes the transition state intermediate, represented by the phosphinate analog. Glu-181 is the catalytic base. (C) Proposed mechanism of VanX (20). The water molecule is activated by Glu-181 and attacks the zinc-polarized carbonyl to form a tetrahedral adduct, which is then stabilized by both the zinc atom and Arg-71. Glu-181 transfers the proton to the nitrogen, which is hydrogen bonded to the carbonyl group of Tyr-109; peptide bond cleavage follows [C; reprinted from ref. 20 with kind permission from Elsevier Science (Amsterdam)].

Table 2. Catalytic efficiencies of the VanX homolog zinc-dependent D-Ala-D-Ala dipeptidases (15)

VanX enzyme | Mol % zinc content | KM (µM) | kcat (s⁻¹) | kcat/KM (s⁻¹ mM⁻¹)
EntVanX (Enterococcus faecalis) | 95 | 80 | 26 | 325
StoVanX (Streptomyces toyocaensis) | 84 | 4 | 12 | 3,000
DdpX (Escherichia coli) | 100 | 14,000 | 170 | 12

Substrate specificity:
* D,D-dipeptides with unmodified N and C termini.
* Does not hydrolyze esters, tripeptides, or dipeptides of L/L or mixed diastereomeric configuration (L/D or D/L).
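The last column of Table 2 follows from the two before it once KM is converted from µM to mM. The Python sketch below recomputes kcat/KM from the tabulated KM and kcat values as a unit check; the class and variable names (VanXKinetics, TABLE_2) are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VanXKinetics:
    """Steady-state parameters for one enzyme, copied from Table 2."""

    enzyme: str
    km_micromolar: float
    kcat_per_second: float

    def catalytic_efficiency(self) -> float:
        """Return kcat/KM in s^-1 mM^-1 (KM is converted from uM to mM)."""
        return self.kcat_per_second / (self.km_micromolar / 1000.0)


# KM (uM) and kcat (s^-1) as listed in Table 2.
TABLE_2 = (
    VanXKinetics("EntVanX", km_micromolar=80.0, kcat_per_second=26.0),
    VanXKinetics("StoVanX", km_micromolar=4.0, kcat_per_second=12.0),
    VanXKinetics("DdpX", km_micromolar=14_000.0, kcat_per_second=170.0),
)

if __name__ == "__main__":
    for row in TABLE_2:
        print(f"{row.enzyme}: kcat/KM = {row.catalytic_efficiency():.0f} s^-1 mM^-1")
```

Rounded to whole numbers, the recomputed efficiencies match the tabulated 325, 3,000, and 12 s⁻¹ mM⁻¹.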

The Dilemma for E. coli Strains That Contain and Express the VanX Homolog (ddpX). Analysis of the E. coli genome database turned up a possible VanX homologue [originally referred to as EcoVanX (15) and renamed here DdpX] with 27% similarity to EntVanX. Expression and purification validated the expected activity, although the KM of 14 mM for D-Ala-D-Ala was 250- to 3,000-fold elevated compared with the EntVanX and StoVanX enzymes (15), consistent with a purely degradative function for DdpX (Table 2). All of the active site residues and auxiliary residues that maintain the active-site topology in EntVanX are conserved in DdpX, and kinetic analysis also revealed the same substrate specificity and discrimination between peptide bond cleavage (D-Ala-D-Ala) and the analogous ester-bond cleavage (D-Ala-D-lactate) reported for EntVanX and StoVanX (15). Homology modeling of DdpX with the crystal structure of EntVanX reveals a striking similarity in the overall structure, with the highest identity seen within the key catalytic residues, as expected for the similarity in substrate specificity (15).

FIG. 3. (A) Comparison of the proposed peptidoglycan termini before and after A47934 antibiotic production by S. toyocaensis. (B) Comparison of the glycopeptide resistance gene operon from vancomycin-resistant enterococci (Enterococcus faecalis) and the glycopeptide producer S. toyocaensis (14).

It was not immediately apparent why a Gram-negative bacterium such as E. coli would contain a VanX enzyme, because the outer membrane barrier provides effective intrinsic resistance to the glycopeptide class of antibiotics. Further, there was no evidence of VanH or VanA homologs in the genome, so there would be no reprogramming of termini of peptidoglycan intermediates. A further aspect of the dilemma was our prior observation that expression of active EntVanX enzyme in E. coli was toxic and led to cell lysis (17), precisely what would be expected for hydrolytic removal of the key D-Ala-D-Ala building block required for cell-wall synthesis and crosslinking.

The existence of DdpX raised the question whether there was any situation in which E. coli could live or would want to live without D-Ala-D-Ala for cell-wall biosynthesis. Inspection of the ddpX gene suggested two clues. First, immediately downstream was a five-gene cluster (ddpABCDF) with all the hallmarks of a peptide permease cluster, including ORFs for a periplasmic binding protein, transmembrane proteins, and ABC ATPase subunits (Fig. 4A). Second, the promoter region of ddpX has two candidate −10 consensus sequences for the RpoS (σs) alternative sigma factor of RNA polymerase (15, 25). The RpoS subunit is switched on in early stationary phase and is a central regulator of transcription of many genes that contribute to survival of the E. coli cell under starvation conditions (conditions that prevail in nature) (26). Indeed, analysis of the ddpX promoter fused to lacZ verified that ddpX is turned on upon entry into stationary phase and, furthermore, that the mRNA produced in stationary phase included the five adjacent candidate permease genes (ddpABCDF) (15). This operon has been named ddpXABCDF (D,D-peptide). When E. coli was assessed for its ability to grow on D-Ala or D-Ala-D-Ala as the sole carbon source, it could use the monomer, oxidized by the membrane enzyme D-amino acid dehydrogenase, but not the dipeptide unless both the ddpX and the five permease genes were specifically up-regulated: the permease can therefore transport D-Ala-D-Ala into the cell (15). At this juncture, the pathway depicted in Fig. 4B can be understood. In stationary phase, the D,D-dipeptide permease, DdpX, and pyruvate oxidase are all produced under RpoS control to enable the import and net oxidation of the D,D-dipeptide to two molecules of acetate and CO2, while eight electrons are funneled down the respiratory chain to provide energy for survival.

FIG. 4. (A) Gene organization at 33.7 min of the E. coli chromosome. ddpXABCDF, orfX, hypothetical protein gene product; osmC, gene for osmotically inducible protein; dipeptide permease homolog gene products are indicated. The ddpX and dipeptide permease genes (ddpABCDF) form an operon (ddpXABCDF) that is turned on at entrance into stationary phase by the RpoS (σs) transcription factor of RNA polymerase. The consensus (25) and putative RpoS (σs) −10 regions are indicated. (B) Proposed action of DdpX and the dipeptide permease (DdpABCDF) (15). During stationary phase, periplasmic D-Ala-D-Ala dipeptide is transported by the Ddp permease transport system into the cytoplasm, where it could be processed by DdpX to release two equivalents of D-Ala. The D-Ala monomer is then converted to acetate by the sequential action of D-amino acid dehydrogenase (D-ADH) and pyruvate oxidase (POX) for production of energy during starvation. Periplasmic D-Ala-D-Ala could arise during A2pm-A2pm crosslink formation, which increases up to 13% of total peptidoglycan crosslinks during stationary phase (26).
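The figure of eight electrons per dipeptide can be reproduced by simple bookkeeping over the pathway of Fig. 4B. The Python sketch below assumes that each D-amino acid dehydrogenase turnover and each pyruvate oxidase turnover is a two-electron oxidation and that DdpX hydrolysis transfers no electrons; these per-step values, like the names Step and PATHWAY, are assumptions made for illustration rather than figures quoted from ref. 15.

```python
from typing import NamedTuple, Sequence


class Step(NamedTuple):
    """One enzymatic step in the proposed stationary-phase pathway."""

    reaction: str
    turnovers_per_dipeptide: int
    electrons_per_turnover: int


# Assumed stoichiometry: one dipeptide yields two D-Ala, each oxidized twice.
PATHWAY = (
    Step("DdpX: D-Ala-D-Ala + H2O -> 2 D-Ala", 1, 0),
    Step("D-amino acid dehydrogenase: D-Ala -> pyruvate + NH3", 2, 2),
    Step("pyruvate oxidase: pyruvate -> acetate + CO2", 2, 2),
)


def electrons_per_dipeptide(pathway: Sequence[Step] = PATHWAY) -> int:
    """Total electrons passed to the respiratory chain per D-Ala-D-Ala imported."""
    return sum(s.turnovers_per_dipeptide * s.electrons_per_turnover for s in pathway)


if __name__ == "__main__":
    for step in PATHWAY:
        subtotal = step.turnovers_per_dipeptide * step.electrons_per_turnover
        print(f"{step.reaction}: {subtotal} e-")
    print(f"Total per dipeptide: {electrons_per_dipeptide()} e-")
```

With these assumptions the two oxidative steps contribute four electrons each, giving the eight electrons per D-Ala-D-Ala stated in the text, along with two acetate and two CO2.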

The last question is where free D-Ala-D-Ala in the periplasm comes from under starvation conditions. One possible source is release during crosslinking of the peptidoglycan layer.


It has been reported that the frequency of a direct crosslink between two diaminopimelate (A2pm) residues on adjacent PG strands rises from 2% in exponential phase to 13% in stationary phase (27). Although the substrate strands for this unusual crosslink have not yet been identified, if they are the normal pentapeptide strands, this crosslink would release the D-Ala-D-Ala dipeptide. The E. coli A2pm-A2pm transpeptidase has not been identified, but it has been suggested that the L,D-dipeptidylcarboxypeptidase previously described in E. coli (28), which cleaves muramylpeptidyl-L-A2pm-D-Ala-D-Ala at the L-A2pm-D-Ala peptide bond and releases D-Ala-D-Ala, could be the long-sought A2pm-A2pm transpeptidase (15). The reprogramming of the crosslinks in stationary phase may be driven by the transshipment of the released D,D-dipeptide from the periplasm back into the cytoplasm to power the cell in starvation mode. Although the cell is catabolizing the cell wall, it cannot decrease the net crosslinking, or the mechanical strength will be insufficient to withstand osmotic pressure and the cell will lyse, hence the need to switch to A2pm-A2pm linkages.

VanX, a Dipeptidase for All Seasons? The three examples noted in this paper reveal distinct niches for the zinc-dependent D-Ala-D-Ala dipeptidase. It may have arisen in the Gram-positive glycopeptide antibiotic producers at the same time as the ability to biosynthesize these antibiotics, providing selective immunity to the bacteria that could both make the antibiotics and reprogram their cell walls to lower the target affinity. Other Gram-positive bacteria in the soil such as lactobacilli, leuconostoc, and pediococci are intrinsically resistant to vancomycin and have been found (29–32) to have also chosen the D-Ala-D-lactate route. In recent times, the opportunistic enterococci have imported the vanHAX gene operon on transposons and plasmids to gain survival advantage via antibiotic resistance in hospital environments that have seen an order of magnitude increase in the therapeutic use of vancomycin in the past 15 yr (33). The Gram-negative E. coli is not challenged by the impermeable glycopeptide antibiotics and has its own version of VanX, but not VanH or VanA. DdpX is a potentially lethal enzyme, because it removes the necessary metabolite D-Ala-D-Ala during peptidoglycan synthesis, and is turned on only in the extreme challenge of stationary phase when starvation threatens and the D-Ala-D-Ala termini of uncrosslinked peptidoglycan strands are retrieved from the periplasm and burned as a metabolic fuel. As additional bacterial genomes are sequenced, more VanX protein homologs are likely to be discovered. Indeed, in the Gram-negative Synechocystis sp. PCC6803, a VanX homolog (16% similarity with EntVanX) possessing kinetic parameters similar to DdpX was detected, but notably it hydrolyzes both L,D- and D,D-dipeptides similarly. It is thus proposed to play a role in scavenging both L,D- and D,D-dipeptide products of cell-wall degradation pathways (15). In the Gram-positive pathogen Mycobacterium tuberculosis, the VanX homolog (21% similarity with EntVanX) possesses all the requirements necessary for dipeptide recognition and catalysis but presents an apparent signal sequence and membrane lipoprotein attachment site, suggesting that MtuVanX might reside in the membrane.

We are grateful to Abbott Laboratories for providing the coordinates of the EntVanX crystal structure. We thank members of the Walsh laboratory for helpful and insightful discussions. I.A.D.L. acknowledges the Medical Research Council of Canada for Postdoctoral Fellowship support. This research was supported in part by National Institutes of Health Grant GM21643 and by funds from Abbott Laboratories.

1. Neu, H.C. (1992) Science 257, 1064–1073.
2. Tomasz, A. (1994) N. Engl. J. Med. 330, 1247–1251.
3. Swartz, M.N. (1994) Proc. Natl. Acad. Sci. USA 91, 2420–2427.
4. Leclercq, R. & Courvalin, P. (1997) Clin. Infect. Dis. 24, 545–554.
5. Murray, E. (1997) Am. J. Med. 102, 284–293.
6. Cunha, B.A. (1995) Med. Clin. N. Am. 19, 817–831.
7. Arthur, M. & Courvalin, P. (1993) Antimicrob. Agents Chemother. 37, 1563–1571.
8. Barna, J.C.J. & Williams, D.H. (1984) Annu. Rev. Microbiol. 38, 339–357.
9. Bugg, T.D.H., Wright, G.D., Dutka-Malen, S., Arthur, M., Courvalin, P. & Walsh, C.T. (1991) Biochemistry 30, 10408–10415.
10. Reynolds, P.E., Depardieu, F., Dutka-Malen, S., Arthur, M. & Courvalin, P. (1994) Mol. Microbiol. 13, 1065–1070.
11. Wright, G.D., Holman, T.R. & Walsh, C.T. (1993) Antimicrob. Agents Chemother. 36, 1514–1518.
12. Walsh, C.T., Fisher, S.L., Park, I.-S., Prahalad, M. & Wu, Z. (1996) Chem. Biol. 3, 21–28.
13. Arthur, M., Molinas, C. & Courvalin, P. (1992) J. Bacteriol. 174, 2582–2591.
14. Marshall, C.G., Lessard, I.A.D., Park, I.-S. & Wright, G.D. (1998) Antimicrob. Agents Chemother. 42, 2215–2220.
15. Lessard, I.A.D., Pratt, S.D., McCafferty, D.G., Bussiere, D.E., Hutchins, C., Wanner, B.L., Katz, L. & Walsh, C.T. (1998) Chem. Biol. 5, 489–504.
16. Wu, Z., Wright, G.D. & Walsh, C.T. (1995) Biochemistry 34, 2455–2463.
17. McCafferty, D.G., Lessard, I.A.D. & Walsh, C.T. (1997) Biochemistry 36, 10498–10505.
18. Bussiere, D.E., Pratt, S.D., Katz, L., Severin, J.M., Holzman, T. & Park, C. (1998) Mol. Cell 2, 75–84.
19. Wu, Z. & Walsh, C.T. (1995) Proc. Natl. Acad. Sci. USA 92, 11603–11607.
20. Lessard, I.A.D. & Walsh, C.T. (1999) Chem. Biol. 6, 177–187.
21. Cundliffe, E. (1992) in Secondary Metabolites: Their Function and Evolution, Ciba Foundation Symposium 171 (Wiley, Chichester), pp. 199–214.
22. Cundliffe, E. (1989) Annu. Rev. Microbiol. 43, 207–233.
23. Marshall, C.G., Broadhead, G., Leskiw, B. & Wright, G.D. (1997) Proc. Natl. Acad. Sci. USA 94, 6480–6483.
24. Marshall, C.G. & Wright, G.D. (1997) FEMS Microbiol. Lett. 157, 295–299.
25. Espinosa-Urgel, M., Chamizo, C. & Tormo, A. (1996) Mol. Microbiol. 21, 657–659.
26. Hengge-Aronis, R. (1996) in Escherichia coli and Salmonella, Cellular and Molecular Biology, eds. Neidhardt, F.C., Curtiss, R., Ingraham, J.L., Lin, E.C.C., Low, K.B., Magasanik, B., Reznikoff, W.S., Riley, M., Schaechter, M. & Umbarger, H.E. (Am. Soc. Microbiol., Washington, DC), pp. 1497–1512.
27. Tuomanen, E., Markiewicz, Z. & Tomasz, A. (1988) J. Bacteriol. 170, 1373–1376.
28. Gondré, B., Flouret, B. & van Heijenoort, J. (1973) Biochimie 55, 685–691.
29. Dartois, V., Phalip, V., Schmitt, P. & Divies, C. (1995) Res. Microbiol. 146, 291–302.
30. Elisha, B.G. & Courvalin, P. (1995) Gene 152, 79–83.
31. Park, I.-S. & Walsh, C.T. (1997) J. Biol. Chem. 272, 9210–9214.
32. Billot-Klein, D., Gutmann, L., Sablé, S., Guittet, E. & van Heijenoort, J. (1994) J. Bacteriol. 176, 2398–2405.
33. Kirst, H.A., Thompson, D.G. & Nicas, T.I. (1998) Antimicrob. Agents Chemother. 42, 1303–1304.


Chaperone rings in protein folding and degradation

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

ARTHUR L. HORWICH*†‡, EILIKA U. WEBER-BAN*, AND DANIEL FINLEY§

*Department of Genetics and †Howard Hughes Medical Institute, Yale School of Medicine, New Haven, CT 06510; and §Department of Cell Biology, Harvard Medical School, Boston, MA 02115

ABSTRACT Chaperone rings play a vital role in the opposing ATP-mediated processes of folding and degradation of many cellular proteins, but the mechanisms by which they assist these life and death actions are only beginning to be understood. Ring structures present an advantage to both processes, providing for compartmentalization of the substrate protein inside a central cavity in which multivalent, potentially cooperative interactions can take place between the substrate and a high local concentration of binding sites, while access of other proteins to the cavity is restricted sterically. Such restriction prevents outside interference that could lead to nonproductive fates of the substrate protein while it is present in non-native form, such as aggregation. At the step of recognition, chaperone rings recognize different motifs in their substrates: exposed hydrophobicity in the case of protein-folding chaperonins, and specific “tag” sequences in at least some cases of the proteolytic chaperones. For both folding and proteolytic complexes, ATP directs conformational changes in the chaperone rings that govern release of the bound polypeptide. In the case of chaperonins, ATP enables a released protein to pursue the native state in a sequestered hydrophilic folding chamber, and, in the case of the proteases, the released polypeptide is translocated into a degradation chamber. These divergent fates are at least partly governed by very different cooperating components that associate with the chaperone rings: that is, cochaperonin rings on one hand and proteolytic ring assemblies on the other. Here we review the structures and mechanisms of the two types of chaperone ring system.

Almost all proteins proceed through a life cycle circumscribed by their folding and degradation. Because both processes are exergonic, it was long assumed that they occur through straightforward molecular mechanisms or simply spontaneously, in the case of folding. Independent studies of these two processes, however, have recently revealed their dependence in vivo on large and remarkably intricate molecular machines (refs. 1 and 2; Fig. 1). These complexes, like many other protein machines, are driven by ATP, but their common physical feature is a ring structure. The ATPase subunits within these machines form symmetric or pseudosymmetric rings of 6–9 members, enclosing a central cavity (Fig. 2). The cavity defines the substrate binding site, and the substrate can enter or exit this cavity by moving perpendicular to the plane of the ring. Folding substrates leave such rings by retracing their original path of entry whereas proteolytic substrates appear to pass through the ring into a second, ATP-independent ring compartment containing proteolytic active sites.

ATP-dependent chaperone rings have proven to be evolutionarily ubiquitous and include well studied protein-folding chaperonins, such as bacterial GroEL (3), the archaebacterial thermosome (4), and the eukaryotic CCT complex (ref. 5; Figs. 1 and 2). Chaperone rings serving as proteolytic assistants include the bacterial ClpA (6), ClpX (7), and HslU (8) and the eukaryotic 19S proteasome cap structure (regulatory particle), also known as PA700 (refs. 9 and 10; Figs. 1 and 2). In the case of chaperonins, their overall function is well established: namely, assisting proteins to fold to their native form. In the case of the ring chaperones involved in proteolytic degradation, their action appears to involve recognition of specific proteins, destabilization of their structure, and translocation of unfolded polypeptide chains into associated proteolytic cylinders (see ref. 11).

The functional similarities between the ATPase rings of the chaperonins and the ATP-dependent proteases may be an example of evolutionary convergence. In any case, there is no significant sequence similarity between these two types of ATPase rings. All known ATP-dependent proteases belong to the Walker family of ATPases, a vast and functionally diverse collection of enzymes (12). By contrast, the design of the ATPase domain of the chaperonins appears to be specific to chaperonins themselves (see, e.g., ref. 13). In both families of ATPases, large-scale conformational changes are dictated by the presence or absence of the γ phosphate of the bound adenine nucleotide. Thus, both systems are to a first approximation two-state systems, although, in the case of GroEL, anticooperative interplay between the two rings and asymmetric binding of GroES provide for at least one additional substate that is critical to the forward movement of the reaction cycle (see below). A detailed understanding of how the ATPase cycle drives proteolysis of protein substrates has not yet been achieved for the ATP-dependent proteases.

The chaperone rings of the ATP-dependent proteases appear to play a preparative role, recognizing proteins slated for turnover and promoting their unfolding, actions that the proteolytic cylinders cannot by themselves carry out. Indeed, in the absence of the associating chaperone ring, proteolytic cylinders, such as bacterial ClpP or the eukaryotic 20S proteasome, degrade small peptides inefficiently and are inactive on physiological protein substrates (see, e.g., refs. 14–16).

In contrast, some of the associating chaperone assemblies, when assayed in the absence of their proteolytic cylinders, retain the ability to recognize physiological substrates and, moreover, appear to be able to dissociate oligomeric proteins or low order protein aggregates (refs. 17–19; see also ref. 20). For example, in the case of ClpX-mediated dissociation of MuA transposase tetramer from recombined DNA (refs. 19 and 21; Fig. 1), the cognate ClpP protease is apparently prevented from acting on MuA at the transposase complex, perhaps by inability to associate with ClpX in this setting. Thus, the relationship between the two actions of protease-associating ring assemblies, assistance to degradation and oligomeric dissociation, may be governed by whether the ring is associated with a cognate proteolytic cylinder.

‡To whom reprint requests should be addressed. E-mail: [email protected]. PNAS is available online at www.pnas.org. Abbreviation: EM, electron microscopy.

CHAPERONE RINGS IN PROTEIN FOLDING AND DEGRADATION 11033

FIG. 1. Schematic illustration of the role of chaperone rings in ATP-dependent protein folding and unfolding/degradation in prokaryotic and eukaryotic cells. Protein folding chaperonins are illustrated in the upper portion of each “cell,” and proteolytic chaperones and the associated proteolytic cylinders are shown in the lower portion. In the case of the prokaryotic Clp components, the homohexameric ATPases, ClpA or ClpX, form coaxial associations with the termini of the double ring cylindrical serine protease, ClpP, delivering recognized substrates to it for degradation (see text). In the absence of association with ClpP, however, ClpA or ClpX can mediate disassembly of oligomeric substrate proteins, exemplified by ClpX-mediated disassembly of the MuA transposase tetramer. Note the two chaperonin classes in the eukaryotic cell (cytosolic and mitochondrial). In the case of the eukaryotic proteasome, the general pathways of ubiquitination to direct proteins for degradation by the proteasome are shown. Not shown is the presence of the proteasome in the nuclear compartment, where similar pathways of turnover appear to be operative.

Notably, other ring-shaped proteolytic assemblies in the cell have covalently linked the ATPase and protease functions in one polypeptide, as in the FtsH bacterial membrane metalloprotease or the related AAA-ATPase containing proteases of the mitochondrial inner membrane, Yta10–12 and Yme1 (refs. 22–26; Fig. 1). Joining of the two functions within one polypeptide is not restricted to the membrane proteases; the soluble bacterial protease Lon and its mitochondrial homolog, PIM1, are similarly designed (25). The principles of action of these proteases may be the same as those of assemblies composed of distinct chaperone and proteolytic rings, but we confine our discussion here to the latter situation, in which the chaperone moiety is amenable to analysis both on its own and in a binary complex with the proteolytic component.

Architecture-Function Considerations

Both chaperonins and the protease-associating chaperone rings, the latter often referred to as regulatory complexes, are radially symmetric (or pseudosymmetric) assemblies of ~110–140 Å diameter, housing axial cavities (refs. 6 and 27; Fig. 2). Chaperonins are composed of two back-to-back rings whose axial cavities are blocked at the equatorial “base” of each ring by the collective of COOH termini of the surrounding subunits, which protrude into the central space (28). (The COOH termini are not resolvable crystallographically because of disorder from a GGM repeat sequence, but the collective of termini is visible as a mass in cryoEM.) Thus, chaperonins contain two noncontiguous cavities, 45–65 Å in diameter, one at each end of the cylindrical structure. The cavities are formed by surrounding apical domains, attached on hinges to small intermediate domains, hinged in turn to the equatorial base (Fig. 2). The central cavities have been identified by electron microscopy (EM) and functional studies as the sites of binding of non-native polypeptide, which, at least in the case of the bacterial chaperonin, GroEL, occurs through hydrophobic side chains exposed on the cavity wall (see ref. 29). These side chains apparently bind exposed hydrophobic surfaces specifically present in non-native proteins.

The folding-active state of GroEL is produced when both ATP and the cochaperonin GroES bind to the polypeptide-containing ring; the apical domains of the bound ring undergo large conformational movements, 60° upward rotation and 90° clockwise twisting motion, that move the hydrophobic binding sites away from the cavity, releasing the bound protein into what is now a sequestered space that is “capped” by GroES and enlarged 2-fold in volume (refs. 3 and 30; Fig. 2). The walls of the cavity assume a hydrophilic character that favors burial of hydrophobic residues in the folding substrate protein and exposure of hydrophilic residues, promoting folding to the native state.

Protease-associated chaperone rings also exhibit axial cavities but, in contrast with those of chaperonins, these seem likely to be, in the active state, continuous channels through which recognized substrate proteins can be translocated into the central space of the associated proteolytic cylinder (11). The diameter of such channels is somewhat uncertain, lacking crystallographic resolution so far, but recent cryoEM studies approximate the cavity in bacterial ClpA to 70–80 Å at the widest point, narrowing down to a 10- to 20-Å passageway at the end that interfaces with ClpP (6). For its own part, ClpP, in a stand-alone crystal structure, exhibits a central opening at its terminal ends of ~10 Å (ref. 31; Fig. 2). This opens into a cavity of >50 Å height and diameter. In the case of the crystal structure of the yeast 20S proteasome (32), there is no detectable axial opening into the chamber, with the NH2 termini of the α-subunits obstructing passage (Fig. 2). This implies a gating action by the ATP-dependent association of the 19S “cap” complex with the proteolytic cylinder. Indeed, in the case of the proteasome, a substitution in the ATP binding site of one of six ATPases in the 19S complex (Rpt2) results in a strong inhibition of the peptidase activity of the proteasome, suggesting that even peptides cannot traverse the channel without involving an ATP-directed gating mechanism (33). The small size and apparent gating of the passageways into the proteolytic cylinders appear likely to exclude the bulk of cellular proteins from the lumen of the proteolytic cylinder. At the same time, a requirement is imposed that proteins must be unfolded before their translocation into the proteolytic cylinder. In fact, ClpA alone has been shown to act as an unfoldase in vitro, globally unfolding a monomeric substrate protein (79). Translocation through the channel may constitute the first committed step in proteolysis by these ATP-dependent proteases.

FIG. 2. Architecture of the eukaryotic proteasome and bacterial ClpAP chaperone-protease complexes and of the bacterial GroEL-GroES chaperonin pair. Side views from electron microscopy of the eukaryotic 26S proteasome (Left) and bacterial ClpAP (Center) showing the respective chaperone assemblies associated with the respective proteolytic cylinders (taken from ref. 11). The stoichiometries of the constituent oligomeric rings are designated by subscripts; note that the eukaryotic proteasome is composed of seven distinct α subunits and seven distinct β subunits arranged 2-fold symmetrically to compose the four rings. Shown below are space-filling cutaway images of the proteolytic cylinders, derived from the crystal structures of Wang et al. (31) and Groll et al. (32), with active sites shown as red dots, as well as ribbon diagrams of their entryways, also taken from ref. 11. A space-filling view of the GroEL-GroES-ADP7 asymmetric chaperonin complex is shown (Upper Right), taken from Xu et al. (3), illustrating the differences between GroEL rings in the polypeptide-accepting and folding-active states. The open trans ring of the asymmetric complex exposes hydrophobic residues (shown in yellow) that can capture a non-native polypeptide. Subsequent GroES/ATP binding to the ring with polypeptide replaces this surface with a hydrophilic one (shown in blue), enlarges the cavity 2-fold in volume, and encapsulates the space in which a polypeptide, released from the hydrophobic binding sites, pursues folding in solitary confinement. Below, the rigid body movements of apical (red) and intermediate (green) domains of GroEL that occur on GroES binding are shown, taken from Xu et al. (3). The apical peptide binding surfaces of helices H and I (arrows), as well as an underlying segment, are removed from facing the central cavity to a position rotated upward 60° and twisted 90° clockwise (see text and ref. 3 for details).

In the case of both the bacterial and eukaryotic chaperone components, the rings apposed coaxially to the proteolytic cylinder are composed of six ATPase-containing subunits (6, 33, 34). Considering that the cognate proteolytic cylinders are 7-membered double or quadruple rings (see, e.g., refs. 31, 32, 36), with the exception of six-fold symmetric HslV (35), there is an obvious symmetry mismatch. With such a 6-on-7 interface, the chaperone subunits cannot form a 1-to-1 match with proteolytic subunits in the same way that, for example, GroEL subunits match up exactly with subunits of the GroES cochaperonin partner (3). It is unclear how this unusual and evolutionarily preserved behavior may translate into a functional role. Is it designed to inherently weaken the association between the two components? This seems unlikely, because most chaperone/protease complexes appear to be stable as long as ATP is present. The symmetry mismatch may dispose to rotational sliding or ratcheting of the faces of the respective rings across each other (6). Perhaps it is a manifestation of a mechanism of translocation of substrate protein down the axial channel, such that a polypeptide chain is “spooled” through a narrow opening into the proteolytic chamber by a rotational or ratcheting motion (see, e.g., ref. 36). This model cannot apply to all ATP-dependent proteases, however. As mentioned above, the ATPase and proteolytic domains are contained within a single polypeptide in the Lon and membrane-bound metalloproteases, where linking of these domains would prevent relative rotation. Interestingly, in EM images of the eukaryotic proteasome, the two asymmetric 19S complexes are observed in a 2-fold rotational orientation with respect to each other, potentially requiring coupled rotation to satisfy a ratchet model (e.g., ref. 10; see Fig. 2).
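The geometry behind the 6-on-7 mismatch can be made concrete with a small calculation. The sketch below is only an idealized illustration: it assumes perfectly regular, coaxial rings of point-like subunits, and the function names are hypothetical. It counts how many subunit pairs can sit in angular register at once and computes the smallest rotation that brings a new pair into register, which is the step size a ratchet-type model would imply under these assumptions.

```python
from math import gcd


def subunit_angles(n, offset=0.0):
    """Angular positions (degrees) of n equally spaced subunits on a ring."""
    return [(offset + 360.0 * i / n) % 360.0 for i in range(n)]


def aligned_pairs(n_chaperone, n_protease, offset=0.0, tol=1e-6):
    """Index pairs (i, j) whose chaperone and protease subunits share an angle.

    offset rotates the chaperone ring relative to the protease ring.
    """
    pairs = []
    for i, a in enumerate(subunit_angles(n_chaperone, offset)):
        for j, b in enumerate(subunit_angles(n_protease)):
            diff = abs(a - b)
            # Compare angles on the circle, so 359.9 and 0.1 count as close.
            if min(diff, 360.0 - diff) < tol:
                pairs.append((i, j))
    return pairs


def realignment_step(n_chaperone, n_protease):
    """Smallest nonzero rotation (degrees) that puts some pair back in register."""
    lcm = n_chaperone * n_protease // gcd(n_chaperone, n_protease)
    return 360.0 / lcm
```

Under these assumptions, a matched 7-on-7 interface (as for GroEL and GroES) has all seven pairs in register at once, whereas a 6-on-7 interface has at most one, and `realignment_step(6, 7)` evaluates to 360/42, roughly 8.6°. Whether any such stepping actually occurs is, as the text notes, an open question.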

In the case of the proteasomal cap structure, not only is there an eight-subunit “base” containing six ATPase subunits, but also an ~400-kDa “lid” structure, comprising eight subunits in yeast, connected to the base by what looks like a “hinge” in EM images (ref. 37; Fig. 2). When the lid is removed from the yeast proteasome by a mutation eliminating a protein supporting the connection to the base (Rpn10), ubiquitinated proteins can no longer be degraded. Thus, the lid appears to be specifically required for recognition of ubiquitin conjugates. By contrast, with only the base structure remaining attached to the 20S proteasome rings, a nonubiquitinated protein, casein, can still be efficiently degraded (37). These observations would seem to support a model of recognition wherein the lid structure binds the ubiquitin moiety of a ubiquitinated protein while the base either simultaneously or subsequently binds the adjoined substrate moiety.

It should be pointed out that ubiquitin conjugation of a protein per se is not associated with its unfolding, although in many cases ubiquitin conjugation may be activated by unfolding of a protein, which exposes sequences or non-native structures recognizable by the ubiquitin conjugation system (see, e.g., ref. 38). In other cases, a ubiquitin-conjugated native substrate structure is presented to the proteasomal regulatory particle and must somehow be unfolded by it, or perhaps trapped in a spontaneously unfolded state. The mechanism of such unfolding remains unclear. Does ubiquitin itself participate in the unfolding process? If ubiquitin and substrate bind at multiple points to the lid and base, then ATP-mediated conformational change could exert a shearing force on the attached substrate protein that would act to unfold it. It seems that the lid structure would, at a minimum, allow retention of ubiquitinated proteins in proximity to the base apparatus, kinetically favoring interaction with it and the consequent ATP-dependent unfolding and translocation into the proteolytic cylinder.

The use by the proteasome of a tag by which to hold the substrate in place while it is exposed to an unfolding machinery is a mechanistic feature quite distinct from anything used by the chaperonins or other classical molecular chaperones. If a proteasome substrate undergoes a failed trial of unfolding, it is unlikely to dissociate because it will presumably remain tethered to the proteasome via the ubiquitin tag. This arrangement may account for the remarkable observation that the proteasome will degrade almost any soluble protein if it is ubiquitinated. However, stabilization of the folded state of the ubiquitinated protein can apparently prevent unfolding and degradation, as was indicated by an experiment with a dihydrofolate reductase variant recruited to the proteasome via the N-end rule pathway. When the folded state of the dihydrofolate reductase was stabilized by binding its ligand, methotrexate, it was no longer subject to degradation (39).

In the same way that the lid presumed to recognize ubiquitin lies atop the ATPase base in the proteasome cap, domains that may have analogous function seem to be present in the bacterial system in some cases. For example, subunits of the ClpA ring contain a second major domain attached through a hinge to the base (refs. 6 and 7; Fig. 2). A second ATPase motif is present in this domain, which probably corresponds to the NH2-terminal portion of ClpA because the COOH-terminal portion is homologous to ClpX, which also associates with ClpP (40). How these domains participate in binding, unfolding, and translocation will require structural study (both EM and crystallographic) as well as functional analyses.

Substrate Protein Recognition

Because ubiquitin is clearly the major recognition determinant for the eukaryotic proteasome, the specificity of its substrate protein recognition is accounted for mainly at the level of the ubiquitin conjugation system. The remarkable set of E3 ubiquitin protein ligases involved in this process appears to be large in number, and the nature of molecular recognition by these gatekeepers is under intensive study (for reviews, see refs. 38 and 41; see Fig. 1). The subunits involved in recognition of ubiquitinated proteins remain to be identified. Subunit S5a/Rpn10 of the proteasome regulatory particle specifically binds multiubiquitin chains in vitro (42). This has been observed with the subunit derived from a number of species, including human, Drosophila melanogaster, Saccharomyces cerevisiae, and Arabidopsis thaliana. Yet studies in vivo in S. cerevisiae, and more recently in plants, show that the ubiquitin chain binding site in S5a/Rpn10 does not have a significant involvement in the degradation of ubiquitinated proteins (see, e.g., refs. 43 and 44; R. Vierstra, personal communication).

In the case of the bacterial chaperone/protease pairs, at least one means of designation for proteolysis involves a tagging mechanism reminiscent of ubiquitination, although it is cotranslational (45, 46). A peptide encoded by the ssrA gene is used to mark for degradation incomplete nascent polypeptide chains that become stalled at the ribosome because the encoding messages have been prematurely transcriptionally terminated or nuclease-cleaved (ref. 46; Fig. 1). This RNA is a remarkable 362-base hybrid RNA whose 5′ end resembles tRNA-ala and whose 3′ end encodes a 10-residue peptide (ANDENYALAA) followed by an ochre terminator. A working model of R. T. Sauer and coworkers (46) suggests that alanine-charged ssrA RNA enters the unoccupied A site of a stalled nascent chain ribosome complex that has reached the 3′ end of a truncated message, adding an alanine (unencoded) to the nascent chain. The ribosome then switches to translation of the ssrA RNA encoding the 10-residue adduct, which is added as an extension of the incomplete nascent chain. This COOH-terminal amino acid sequence comprises an element for recognition and proteolytic degradation by ClpAP or ClpXP (47). In particular, when the ssrA peptide was added at the coding sequence level to λ repressor (amino acids 1–93), it led to rapid turnover of the fusion protein in vivo. Conversely, such fusion proteins were no longer rapidly degraded in ClpP deletion mutants or in ClpA-ClpX double deletion mutants.
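The two-step tagging model just described can be written out as a simple string operation on one-letter amino acid sequences. This is a minimal sketch of the working model as summarized in the text, not a simulation of the ribosome; the function names and the example chain are hypothetical, and "recognition" is reduced to a check for the quoted COOH-terminal peptide.

```python
# 10-residue peptide encoded by the ssrA RNA, as quoted in the text.
SSRA_ENCODED_PEPTIDE = "ANDENYALAA"


def ssra_tag(stalled_chain):
    """Return the tagged product for a stalled nascent chain (one-letter codes)."""
    if not stalled_chain:
        raise ValueError("expected a non-empty nascent chain")
    # Step 1: the alanine carried by the charged ssrA RNA is added to the chain;
    # it is not encoded by any message.
    chain = stalled_chain + "A"
    # Step 2: the ribosome switches to the ssrA reading frame and appends the
    # encoded 10-residue peptide before terminating.
    return chain + SSRA_ENCODED_PEPTIDE


def carries_ssra_signal(protein):
    """Crude stand-in for recognition: is the ssrA peptide at the COOH terminus?"""
    return protein.endswith(SSRA_ENCODED_PEPTIDE)
```

For a hypothetical stalled chain `"MKT"`, `ssra_tag("MKT")` gives the chain followed by the unencoded alanine and the encoded peptide, and `carries_ssra_signal` is true only for the tagged product.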

Interestingly, another signal for recognition by the ClpAP complex resides at the other, NH2-terminal, end of a potential set of test proteins that followed the N-end rule for degradation in bacteria (48). The presence of arginine, lysine, leucine, phenylalanine, tyrosine, or tryptophan conferred short half-life (<2 min) on the test proteins exposing one of these residues at the NH2 terminus. This short half-life compares to proteins exposing other residues, measuring >10 hr. For proteins bearing the destabilizing residues, deletion of ClpA resulted in alteration of half-life to that of proteins bearing the stabilizing residues. So far, such observations have not been extended to physiological substrates. The recognition event has not been reconstituted in vitro with purified ClpA and could possibly involve additional factors. It remains a fascinating question as to how ClpA specifically recognizes the NH2-terminal residue in the test protein studied.
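The bacterial N-end rule as summarized above amounts to a lookup on the first residue. The sketch below encodes only what the paragraph reports for the test proteins (six destabilizing residues, the two half-life ranges, and the effect of deleting ClpA); the function and parameter names are hypothetical, and the returned strings are the qualitative ranges from the text rather than predictions for any real protein.

```python
# NH2-terminal residues reported as destabilizing: Arg, Lys, Leu, Phe, Tyr, Trp.
DESTABILIZING_N_TERMINI = frozenset("RKLFYW")


def n_end_rule_class(protein):
    """Classify a protein (one-letter codes) by its NH2-terminal residue."""
    if not protein:
        raise ValueError("expected a non-empty sequence")
    if protein[0].upper() in DESTABILIZING_N_TERMINI:
        return "destabilizing"
    return "stabilizing"


def reported_half_life(protein, clpa_present=True):
    """Half-life range reported for the test proteins in the text.

    Without ClpA, proteins with destabilizing residues behaved like those
    with stabilizing residues.
    """
    if clpa_present and n_end_rule_class(protein) == "destabilizing":
        return "<2 min"
    return ">10 hr"
```

For example, a hypothetical test sequence beginning with arginine falls in the short-lived class when ClpA is present and in the long-lived class when `clpa_present=False`.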

Recognition of the COOH-terminal sequences in substrates by ClpX (and likely ClpA as well) appears to be mediated through a COOH-terminal domain in the chaperone, distal to the ATPase domain, that contains a tandem motif (49). The two motifs have been suggested to resemble PDZ domains, which are modular 100-residue structures shown to recognize COOH-terminal tetrapeptides with a characteristic primary sequence, X-Thr/Ser-X-Val-COO− (50–52). The similarity remains, however, to be established by structural studies. Nevertheless, when one or both of these motifs of ClpX were expressed independently, they were able to efficiently bind an Arc-MuA fusion bearing the COOH-terminal 10-residue sequence of MuA (LEQNRRKKAI), which is required for recognition and disassembly of tetrameric MuA by ClpX (21). Likewise, the isolated motifs recognized an Arc-ssrA fusion. The COOH-terminal tetrapeptide sequences recognized by PDZ domains are unstructured until they become bound, whereupon they are incorporated as an additional β-strand at the edge of a sheet in the PDZ domain. Consistently, here, the COOH-terminal region of MuA in an Arc-MuA fusion was shown to be unstructured in one-dimensional NMR studies whereas the NH2-terminal Arc region behaved as a native structure. In sum, the ssrA and MuA COOH-terminal recognition systems in bacteria bear some degree of resemblance to the ubiquitin system, with the signals themselves not leading to dissociation/unfolding of the substrate protein until the signal recruits the protein to the cap structure. Baker and coworkers have proposed that the tandem substrate recognition domains of ClpX may disassemble oligomeric substrates by forming two points of contact with separate subunits of the tetramer; subsequent ATP-directed conformational change of the ClpX might then pry apart the subunits (49).
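The PDZ consensus quoted above, X-Thr/Ser-X-Val at the free COOH terminus, is simple enough to express as a pattern match. The sketch below is an illustration only: it takes the consensus literally, treats X as any of the 20 standard residues, and uses hypothetical names throughout. It encodes the consensus as written and says nothing about how ClpX actually binds its substrates.

```python
import re

_ANY = "[ACDEFGHIKLMNPQRSTVWY]"  # X: any standard residue (assumption)

# X-Thr/Ser-X-Val anchored at the COOH terminus of the sequence.
PDZ_CTERM_MOTIF = re.compile(_ANY + "[TS]" + _ANY + "V$")

# COOH-terminal sequences quoted in the text.
MUA_TAIL = "LEQNRRKKAI"
SSRA_PEPTIDE = "ANDENYALAA"


def matches_pdz_consensus(sequence):
    """True if the last four residues fit X-Thr/Ser-X-Val."""
    return PDZ_CTERM_MOTIF.search(sequence.upper()) is not None
```

Applied to the two tails quoted in the text, the literal consensus matches neither `MUA_TAIL` nor `SSRA_PEPTIDE`, whereas a made-up peptide ending in QTSV does match. This is in keeping with the caution above that the resemblance to PDZ domains remains to be established.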

The mechanism of recognition of other targets of ClpA and ClpX is less clear. Surprisingly, though, dimers of the plasmid P1 initiator protein, RepA, can associate with unassembled ClpA subunit monomers or dimers (under conditions of absence of nucleotide; assembly of ClpA into hexamer requires adenine nucleotide). These complexes, however, are unstable (53). By contrast, complexes formed with ClpA hexamer at 23°C in ATPγS were stable and “committed,” such that they could only release RepA after exposure to ATP, discharging it as the DNA binding-competent monomer.

In the case of chaperonins, it is clear that specific primary sequences in non-native substrate proteins are not involved with recognition but, rather, that structures with exposed hydrophobic surfaces, such as collapsed states that can bind in the central cavity of the chaperonin, are recognized (see ref. 29). Binding seems likely to be multivalent; i.e., it involves multiple contacts between the polypeptide and the surrounding apical domains. There is some uncertainty concerning whether the action of polypeptide binding is associated with partial unfolding of kinetically trapped substrate proteins. Whereas the small protein, barnase, can be transiently globally unfolded by GroEL (54), other natural substrate proteins that require the complete GroEL/GroES/ATP system for folding are not subject to global unfolding in association with binding, as determined by deuterium exchange experiments (refs. 55 and 56, and S. Walter, personal communication). More generally, GroEL may favor binding of less-folded states and, as such, may shift an equilibrium between non-native species toward less-folded ones (57).

Action of ATP

The role of ATP, for both chaperonins and the ring chaperones involved in proteolysis, is to galvanize the components into an active association with their respective cochaperonins and proteolytic cylinders and to commence the particular actions of folding or degradation. The specific roles of ATP binding and hydrolysis are better understood for chaperonins but are beginning to be dissected for the proteolytic complexes as well. In the case of both machineries, it appears that ATP binding is sufficient to drive formation of the active complexes. In the case of GroEL, for example, binding of ATP to a ring occurs cooperatively at all seven sites (58, 59) and enables rapid and high-affinity binding of GroES to the same ring (60). This association is accompanied by the large conformational changes mentioned above, which are associated with release, possible transient unfolding (80), and subsequent folding of a bound polypeptide in the encapsulated cavity of the ring (3, 30, 61). At the same time, binding of ATP to one GroEL ring is specifically anticooperative for binding of ATP in the opposite ring (62), and, because ATP occupancy is required for efficient GroES binding, this sets up the inherent asymmetry of the chaperonin system, such that only one ring is folding-active at a time (refs. 30 and 63; see Fig. 3).

FIG. 3. GroEL-GroES reaction cycle: rings alternate in formation of folding-active cis ternary complexes. Folding is triggered when ATP and GroES bind to the same (cis) ring as polypeptide, releasing it into the GroES-encapsulated, enlarged, and now hydrophilic cavity. This very stable complex is the longest-lived state of the chaperonin system in the presence of non-native polypeptide (63), and it is weakened and prepared for dissociation by hydrolysis in the cis ring, which allows entry of ATP and non-native polypeptide into the trans ring (30). These in turn accelerate the dissociation of the cis ligands, including polypeptide. GroES binds to the ATP/polypeptide-liganded trans ring, completing formation of a new cis complex on this ring. Thus, GroEL rings alternate back and forth as folding-active (see text for additional detail).
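The alternation of the two GroEL rings described in the Fig. 3 legend can be captured as a toy state machine. The sketch below is a bookkeeping illustration under strong simplifying assumptions: it tracks only which ligands each ring holds, ignores kinetics and whether the released polypeptide has reached the native state, and uses hypothetical class and function names.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    name: str
    nucleotide: str = ""    # "", "ATP", or "ADP"
    groes: bool = False     # True when GroES caps this ring (cis ring)
    polypeptide: str = ""   # label of the bound or encapsulated substrate


def discharge(ring):
    """Release GroES, nucleotide, and polypeptide from the former cis ring."""
    released = ring.polypeptide
    ring.nucleotide, ring.groes, ring.polypeptide = "", False, ""
    return released


def run_cycles(substrates):
    """Return (order of cis rings, polypeptides discharged) for the substrates."""
    rings = [Ring("ring 1"), Ring("ring 2")]
    cis = None
    cis_order, released = [], []
    for substrate in substrates:
        if cis is None:
            target = 0                        # first round: either ring is open
        else:
            rings[cis].nucleotide = "ADP"     # cis hydrolysis weakens the complex
            target = 1 - cis                  # trans ring can now accept ligands
        # ATP and non-native polypeptide bind the open (trans) ring.
        rings[target].nucleotide = "ATP"
        rings[target].polypeptide = substrate
        if cis is not None:
            # Trans ligands accelerate departure of the cis ligands.
            released.append(discharge(rings[cis]))
        rings[target].groes = True            # GroES binding completes the new cis complex
        cis = target
        cis_order.append(rings[cis].name)
    return cis_order, released
```

Feeding three hypothetical substrates through `run_cycles` shows the cis complex forming on ring 1, then ring 2, then ring 1 again, with each polypeptide discharged only after ligands have bound the opposite ring.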

In the case of the proteasome, ATP binding drives the stable association of cap structures at both ends of the catalytic cylinder, a step that appears to be cooperative (6, 64). In the case of the ClpA chaperone, the presence of ATP or a nonhydrolyzable analogue, such as ATPγS, is required for stable assembly of the hexamer ring and for its association with ClpP (65). Yet here, in contrast with the chaperonin system, ATP binding alone is probably not sufficient to initiate proteolysis. For example, in the presence of ATPγS, the ClpA substrate, RepA, remained stably bound by ClpA hexamer (53). Only on subsequent addition of ATP was it released as monomer. Concordantly, RepA was not degraded in the presence of ATPγS, ClpA, and ClpP. These observations suggest that ATP hydrolysis is likely to be required for both actions associated with RepA proteolysis: dissociation of the RepA subunits from each other and unfolding/translocation. Nevertheless, it remains possible that ATPγS fails to produce the same stereochemistry of binding and resultant allosteric effects as ATP. This has proven to be the case for the GroEL system, for example, where AMP-PNP was found to be able neither to promote folding in association with formation of the cis complex nor to productively discharge the cis ligands from a cis ADP ternary complex on binding to the trans ring (30).


In contrast, both of these actions were readily promoted by binding of ATP, in the absence of hydrolysis (see Fig. 3). The stereochemical differences between ATP and its analogues were revealed by study of GroEL with a catalytic mutation that did not affect its affinity for ATP but reduced turnover to a level ≈2% of wild-type. Additionally, real-time substrate fluorescence measurements with wild-type GroEL in the presence of ATP/GroES showed immediate onset of fluorescence changes reflective of folding well before any cis ATP hydrolysis occurred (30). Yet with the proteolytic assistants, it seems more likely that hydrolysis is required, because multiple turnovers of the chaperone ring seem likely to be necessary to unfold and translocate the protein substrate into the proteolytic cylinder. Further studies, such as single ATP turnover experiments, will be necessary to address such action.

The nature of the conformational change exerted on the ClpA ring by ATP binding and hydrolysis is as yet unknown. Correspondingly, it is unclear whether the ATP-mediated conformational change of ClpA affects ClpP, either directly, by opening its orifice, for example, or indirectly, by allosterically influencing its active sites (see ref. 66 for discussion). In the case of the proteasome cap, a “wagging” or rocking motion of the 19S cap relative to the 20S core has been reported, although whether this is linked to ATP binding and turnover has yet to be resolved (10). Thus, for these components, the means by which ATP turnover translates into unfolding and translocation of substrate remains to be seen. Notably, ATP turnover proceeds in these rings regardless of whether the proteolytic cylinder or substrate polypeptide is associated.

There seem to be two general models for how ATP-driven translocation between chaperone ring and protease might occur. One involves ATP-dependent unfolding of substrate in the chaperone ring, associated with threading of an extended polypeptide chain through a narrow passage into the proteolytic cylinder, which itself engages and may “pull” the substrate via contacts with it, degrading it in a more or less processive manner. Such translocation resembles that of ER and mitochondrial precursor proteins traversing Sec and Tom/Tim membrane complexes in extended states. In those settings, an Hsp70 chaperone functions with expenditure of ATP to “ratchet” or “motor” the chain across the membrane from inside. Here, ATP is expended “behind” the passageway into the proteolytic cylinder, suggesting that a “pushing” action could be involved. This more closely resembles the system of export through the bacterial membrane, where the SecA chaperone utilizes ATP (binding) to drive a segment of both itself and polypeptide substrate through the translocon (67). In the case of the proteolytic cylinders, alternatively, perhaps the energy of peptide bond cleavage at the inside aspect could be coupled to forward movement. A second model invokes an ATP-directed conformational change that associates unfolding of substrate in the chaperone ring with a conformational switch amounting to opening of a “trap door” into the proteolytic component that allows the entire substrate, or perhaps domain-sized portions, to drop into the proteolytic cylinder for multipoint proteolytic processing. This would presumably involve opening of both the axial exitway in the base of the chaperone ring and the entryway of the proteolytic cylinder. Although it seems clear, for example, that the 20S proteasome must be gated, as yet the gate has not been observed in an open state, so the size of the opening is unknown.

It is also unclear whether nucleotide binding and turnover in the proteolytic chaperone rings is cooperative or synchronous or whether it occurs in some sequential manner that could be linked to rotational motion. In the case of chaperonins, cooperative ATP binding is used within a ring to enable it to function as a uniform 7-fold symmetric unit in binding the 7-fold symmetric GroES cochaperone. ATP hydrolysis is used in a folding-active ring to weaken the stable association of GroES with the ring (ref. 30; see Fig. 3). Such weakening “primes” the ring for dissociation that is allosterically triggered by binding of ATP and non-native polypeptide in the open opposite (trans) ring (refs. 30 and 63; see Fig. 3). At the same time that it primes the cis ring, cis hydrolysis sends an allosteric signal to the trans ring that adjusts its apical domains from a conformational orientation that cannot accept ligands to one that is now fully open and available for binding (ref. 63; see Fig. 3). Thus, ATP hydrolysis allosterically primes the GroEL machine to switch rings, signaling the end of a folding reaction in one ring and the preparation for a new one on the opposite ring.

In the case of GroEL, but also the proteolytic chaperones, there seems to be no requirement at any point in the cycle to produce symmetric complexes bearing a ring at both ends of the assembly simultaneously (63, 65, 68). For example, a second GroES does not have to bind to discharge the one present in a folding-active complex (30, 69, 70). Recent kinetic studies indicate, in fact, that GroES cannot bind to an available ATP-bound trans ring until a slow transition occurs within the cis ADP chaperonin complex that leads to the departure of the cis GroES (63). Thus, at most, one GroES is arriving while the other is departing. By contrast, stable 2:1 assemblies of chaperone-protease complexes can be isolated from cells and are readily formed in vitro; yet, these assemblies appear to be no more catalytically active than 1:1 complexes (65, 68), raising such issues as whether only one side of a 2:1 complex can be occupied with substrate and proteolytically active at any time, whether there is alternation between sides, and how occupancy of one side could inhibit access of substrate to, proteolytic activity of, or departure of product peptides from the other side.

Commitment of Substrate

In the case of RepA dimer, a single round of association with ClpA followed by ATP-mediated release is sufficient to produce the DNA binding-competent RepA monomer (53). This was demonstrated either by supplying an excess of casein substrate competitor that would block rebinding of RepA or by diluting the RepA-ClpA binary complex formed in ATPγS before ATP addition, such that rebinding would be disfavored. These studies indicate commitment of dimeric RepA substrate to dissociation into monomers in one round of interaction with the chaperone. Such committed behavior with respect to substrate protein differs considerably from that of chaperonins, which appear to eject substrate proteins after a timed period of folding in the cis chamber, regardless of whether substrate has reached the native state or not (see ref. 29; Fig. 3). For many substrate proteins, this results in a requirement for multiple rounds of release and rebinding by chaperonin in order for a population of molecules to reach the native state. Such release of non-native forms allows a kinetic partitioning to occur, whence non-native forms not only can be rebound by chaperonin but may be recognized by other chaperones, or even proteases (70–73). This prevents the chaperonin system from becoming engorged with misfolded or defective proteins that are not able to reach the native state. This behavior, attractive for chaperonins, would not be as appealing for the proteolytic system; in general, the release of partly degraded proteins would have little or no functional value.

On the other hand, it has been suggested that some substrates may be only partially processed by the proteasome: for example, the transcription factor NF-κB (74). It has been shown that removal of a COOH-terminal cytoplasmic anchoring domain from the NF-κB precursor protein, p105, enables the NH2-terminal domain to enter the nucleus to activate transcription. One model for how this occurs involves preferential unfolding and translocation of the COOH-terminal domain into the proteolytic cylinder, with action of the proteasome somehow aborted before the NH2-terminal transcriptional activation domain enters the cylinder. A resistance of the NH2-terminal domain to unfolding may underlie its resistance to degradation, but there seems also the possibility that removal of a ubiquitin tag from the polypeptide could provide a means of escape, as discussed below. Alternative to partial processing, however, is a model in which initial cleavage by an endopeptidase separates the two domains and is followed by proteasome-mediated degradation of the COOH-terminal fragment (75).

Although the proteolytic system appears generally to be a committed one, it seems to have evolved a fail-safe mechanism or “editor” that prevents inappropriate commitment to turning over potentially active substrate proteins. The PA700 isopeptidase, an integral component of the mammalian 19S cap, enables removal of ubiquitin monomers from polyubiquitinated proteins, affording the chance to rescue those proteins that bear only short lengths of ubiquitin chains (ref. 76; see also ref. 77). The bias against degradation of short chains would predispose the proteasome to preferentially degrade proteins that have efficiently interacted with E3 enzymes to produce processive formation of long chains. Given that polyubiquitin chains are disassembled by the proteasome from their distal ends (76), translocation into the proteolytic cylinder of substrates carrying long polyubiquitin chains is likely to be favored kinetically over chain removal. On the other hand, those proteins with only short ubiquitin chains can be relieved of them and thus protected from proteasomal turnover.

Isopeptidases are also crucial for the recycling of ubiquitin, insofar as ubiquitin itself is not degraded by the proteasome. So, for both aborted and “productive” substrates, there may be an obligatory, but presumably late, step of ubiquitin removal. Indeed, ubiquitin removal is probably also necessary to allow the entire substrate polypeptide to pass through the channel into the proteolytic cylinder because the folded state of ubiquitin is remarkably stable (e.g., ref. 78), and it is thus likely to present a barrier to translocation. In addition, the ubiquitin chain is likely to be inaccessible to translocation because it is anchored to the ubiquitin receptor of the regulatory particle. These considerations suggest that the last key step in the proteasome’s reaction cycle occurs within the regulatory particle and consists in the release of the ubiquitin chain from the substrate. This step is presumably operative for all substrates, both “typical” ones as well as “nonsubstrates” that carry too few ubiquitin groups and are released after being edited by the PA700 isopeptidase.

Prospects for Further Mechanistic Understanding

In the short term, we can look forward to crystallographic views of states of the proteolytic chaperone rings, which will allow deductions about ATP-mediated unfolding and translocation, and further mechanistic studies. For both the chaperonin and proteolytic ring systems, however, attention must focus ultimately on the fate of the substrate polypeptide. This represents a challenge in both cases because the substrate occupies, or comes to occupy in the case of the proteolytic machines, a non-native conformation that does not exhibit the structural order and symmetry of the machines themselves. Indeed, substrates seem likely to occupy an ensemble of conformations, as compared with the uniformity from molecule to molecule of the states of the machines. Understanding this system will require facing some of the same problems that the chaperonin system currently confronts related to the location and conformation of substrate during binding and folding, both of which are difficult to examine at high resolution. Spectroscopic approaches seem likely to yield the most definitive answers but will be stretched to their limits to get at these questions.

We thank W. Fenton for critical reading of the manuscript. E.U.W. is supported by a Jane Coffin Childs Fellowship and A.L.H. by the Howard Hughes Medical Institute.

1. Baumeister, W., Walz, J., Zühl, F. & Seemüller, E. (1998) Cell 92, 367–380.
2. Bukau, B. & Horwich, A.L. (1998) Cell 92, 351–366.
3. Xu, Z., Horwich, A.L. & Sigler, P.B. (1997) Nature (London) 388, 741–750.
4. Ditzel, L., Löwe, J., Stock, D., Stetter, K.-O., Huber, H., Huber, R. & Steinbacher, S. (1998) Cell 93, 125–138.
5. Lewis, V.A., Hynes, G.M., Zheng, D., Saibil, H. & Willison, K. (1992) Nature (London) 358, 249–252.
6. Beuron, F., Maurizi, M.R., Belnap, D.M., Kocsis, E., Booy, F.P., Kessel, M. & Steven, A.C. (1998) J. Struct. Biol. 123, 248–259.
7. Grimaud, R., Kessel, M., Beuron, F., Steven, A.C. & Maurizi, M.R. (1998) J. Biol. Chem. 273, 12476–12481.
8. Rohrwild, M., Pfeifer, G., Santarius, U., Müller, S.A., Huang, H.-C., Engel, A., Baumeister, W. & Goldberg, A.L. (1997) Nat. Struct. Biol. 4, 133–139.
9. DeMartino, G.N., Moomaw, C.R., Zagnitko, O.P., Proske, R.J., Chu-Ping, M., Afendis, S.J., Swaffield, J.C. & Slaughter, C.A. (1994) J. Biol. Chem. 269, 20878–20884.
10. Walz, J., Erdmann, A., Kania, M., Typke, D., Koster, A.J. & Baumeister, W. (1998) J. Struct. Biol. 121, 19–29.
11. Larsen, C.N. & Finley, D. (1997) Cell 91, 431–434.
12. Confalonieri, F. & Duguet, M. (1995) BioEssays 17, 639–650.
13. Boisvert, D.C., Wang, J., Otwinowski, Z., Horwich, A.L. & Sigler, P.B. (1996) Nat. Struct. Biol. 3, 170–177.
14. Hwang, B.J., Woo, K.M., Goldberg, A.L. & Chung, C.H. (1988) J. Biol. Chem. 263, 8727–8734.
15. Arrigo, A.-P., Tanaka, K., Goldberg, A.L. & Welch, W.J. (1988) Nature (London) 331, 192–194.
16. Chu-Ping, M., Vu, J.H., Proske, R.J., Slaughter, C.A. & DeMartino, G.N. (1994) J. Biol. Chem. 269, 3539–3547.
17. Wickner, S., Gottesman, S., Skowyra, D., Hoskins, J., McKenney, K. & Maurizi, M.R. (1994) Proc. Natl. Acad. Sci. USA 91, 12218–12222.
18. Wawrzynow, A., Wojtkowiak, D., Marszalek, J., Banecki, B., Jonsen, M., Graves, B., Georgopoulos, C. & Zylicz, M. (1995) EMBO J. 9, 1867–1877.
19. Levchenko, I., Luo, L. & Baker, T.A. (1995) Genes Dev. 9, 2399–2408.
20. Glover, J.R. & Lindquist, S. (1998) Cell 94, 73–82.
21. Levchenko, I., Yamauchi, M. & Baker, T.A. (1997) Genes Dev. 11, 1561–1572.
22. Leonhard, K., Herrmann, J.M., Stuart, R.A., Mannhaupt, G., Neupert, W. & Langer, T. (1996) EMBO J. 15, 4218–4229.
23. Arlt, H., Tauer, R., Feldmann, H., Neupert, W. & Langer, T. (1996) Cell 85, 875–885.
24. Gottesman, S., Maurizi, M.R. & Wickner, S. (1997) Cell 91, 435–438.
25. Suzuki, C.K., Rep, M., Maarten van Dijl, J., Suda, K., Grivell, L.A. & Schatz, G. (1997) Trends Biochem. Sci. 22, 118–122.
26. Arlt, H., Steglich, G., Perryman, R., Guiard, B., Neupert, W. & Langer, T. (1998) EMBO J. 17, 4837–4847.
27. Braig, K., Otwinowski, Z., Hegde, R., Boisvert, D.C., Joachimiak, A., Horwich, A.L. & Sigler, P.B. (1994) Nature (London) 371, 578–586.
28. Chen, S., Roseman, A.M., Hunter, A.S., Wood, S.P., Burston, S.G., Ranson, N.A., Clarke, A.R. & Saibil, H.R. (1994) Nature (London) 371, 261–264.
29. Fenton, W.A. & Horwich, A.L. (1997) Protein Sci. 6, 743–760.
30. Rye, H.S., Burston, S.G., Fenton, W.A., Beechem, J.M., Xu, Z., Sigler, P.B. & Horwich, A.L. (1997) Nature (London) 388, 792–798.
31. Wang, J., Hartling, J.A. & Flanagan, J.M. (1997) Cell 91, 447–456.
32. Groll, M., Ditzel, L., Löwe, J., Stock, D., Bochtler, M., Bartunik, H.D. & Huber, R. (1997) Nature (London) 386, 463–471.
33. Rubin, D.M., Glickman, M.H., Larsen, C.N., Dhruvakumar, S. & Finley, D. (1998) EMBO J. 17, 4909–4919.
34. Glickman, M.H., Rubin, D.M., Fried, V.A. & Finley, D. (1998) Mol. Cell. Biol. 18, 3149–3162.
35. Bochtler, M., Ditzel, L., Groll, M. & Huber, R. (1997) Proc. Natl. Acad. Sci. USA 94, 6070–6074.

36. Löwe, J., Stock, D., Jap, B., Zwickl, P., Baumeister, W. & Huber, R. (1995) Science 268, 533–539.
37. Glickman, M.H., Rubin, D.M., Coux, O., Wefes, I., Pfeifer, G., Cjeka, Z., Baumeister, W., Fried, V.A. & Finley, D. (1998) Cell 94, 615–623.
38. Varshavsky, A. (1997) Trends Biochem. Sci. 22, 383–387.
39. Johnston, J.A., Johnson, E.S., Waller, P.R.H. & Varshavsky, A. (1995) J. Biol. Chem. 270, 8172–8178.
40. Gottesman, S., Clark, W.P., Crecy-Lagard, V. & Maurizi, M.R. (1993) J. Biol. Chem. 268, 22618–22626.
41. Ciechanover, A. (1998) EMBO J. 17, 7151–7160.
42. Deveraux, Z., Ustrell, V., Pickart, C. & Rechsteiner, M. (1994) J. Biol. Chem. 269, 7059–7061.
43. van Nocker, S., Sadis, S., Rubin, D.M., Glickman, M.H., Fu, H., Coux, O., Wefes, I., Finley, D. & Vierstra, R.D. (1996) Mol. Cell. Biol. 16, 6020–6028.
44. Fu, H., Sadis, S., Rubin, D.M., Glickman, M., van Nocker, S., Finley, D. & Vierstra, R.D. (1998) J. Biol. Chem. 273, 1970–1981.
45. Tu, G.-F., Reid, G.E., Zhang, J.-G., Moritz, R.L. & Simpson, R.J. (1995) J. Biol. Chem. 270, 9322–9326.
46. Keiler, K.C., Waller, P.R.H. & Sauer, R.T. (1996) Science 271, 990–993.
47. Gottesman, S., Roche, E., Zhou, Y.N. & Sauer, R.T. (1998) Genes Dev. 12, 1338–1347.
48. Tobias, J.W., Shrader, T.E., Rocap, G. & Varshavsky, A. (1991) Science 254, 1374–1377.
49. Levchenko, I., Smith, C.K., Walsh, N.P., Sauer, R.T. & Baker, T.A. (1997) Cell 91, 939–947.
50. Kim, E., Niethammer, M., Rothschild, A., Jan, Y.N. & Sheng, M. (1995) Nature (London) 378, 85–88.
51. Kornau, H.-C., Schenker, L.T., Kennedy, M.B. & Seeburg, P.H. (1995) Science 269, 1737–1740.
52. Doyle, D.A., Lee, A., Lewis, J., Kim, E., Sheng, M. & MacKinnon, R. (1996) Cell 85, 1067–1076.
53. Pak, M. & Wickner, S. (1997) Proc. Natl. Acad. Sci. USA 94, 4901–4906.
54. Zahn, R., Perrett, S., Stenberg, G. & Fersht, A.R. (1996) Science 271, 642–645.
55. Gross, M., Robinson, C.V., Mayhew, M., Hartl, F.U. & Radford, S.E. (1996) Protein Sci. 5, 2506–2513.
56. Goldberg, M.S., Zhang, J., Matthews, C.R., Fox, R.O. & Horwich, A.L. (1997) Proc. Natl. Acad. Sci. USA 94, 1080–1085.
57. Walter, S., Lorimer, G.H. & Schmid, F.X. (1996) Proc. Natl. Acad. Sci. USA 93, 9425–9430.
58. Gray, T.E. & Fersht, A.R. (1991) FEBS Lett. 292, 254–258.
59. Bochkareva, E.S., Lissin, N.M., Flynn, G.C., Rothman, J.E. & Girshovich, A.S. (1992) J. Biol. Chem. 267, 6796–6800.
60. Jackson, G.S., Staniforth, R.A., Halsall, D.J., Atkinson, T., Holbrook, J.J., Clarke, A.R. & Burston, S.G. (1993) Biochemistry 32, 2554–2563.
61. Kad, N.M., Ranson, N.A., Cliff, M.J. & Clarke, A.R. (1998) J. Mol. Biol. 278, 267–278.
62. Yifrach, O. & Horovitz, A. (1995) Biochemistry 34, 5303–5308.
63. Rye, H.S., Roseman, A.M., Chen, S., Furtak, K., Fenton, W.A., Saibil, H.R. & Horwich, A.L. (1999) Cell 97, 325–338.
64. Adams, G.M., Falke, S., Goldberg, A.L., Slaughter, C.A., DeMartino, G.N. & Gogol, E.P. (1997) J. Mol. Biol. 273, 646–657.
65. Maurizi, M.R., Singh, S.K., Thompson, M.W., Kessel, M. & Ginsburg, A. (1998) Biochemistry 37, 7778–7786.
66. Gottesman, S., Wickner, S. & Maurizi, M.R. (1997) Genes Dev. 11, 815–823.
67. Economou, A. & Wickner, W. (1994) Cell 78, 835–843.
68. Adams, G.M., Crotchett, B., Slaughter, C.A., DeMartino, G.N. & Gogol, E.P. (1998) Biochemistry 37, 12927–12932.
69. Hayer-Hartl, M., Martin, J. & Hartl, F.-U. (1995) Science 269, 836–841.
70. Burston, S.G., Weissman, J.S., Farr, G.W., Fenton, W.A. & Horwich, A.L. (1996) Nature (London) 383, 96–99.
71. Farr, G.W., Scharl, E.C., Schumacher, R.J., Sondek, S. & Horwich, A.L. (1997) Cell 89, 927–937.
72. Ranson, N.A., Burston, S.G. & Clarke, A.R. (1997) J. Mol. Biol. 266, 656–664.
73. Kandror, O., Busconi, L., Sherman, M. & Goldberg, A.L. (1994) J. Biol. Chem. 269, 23575–23582.
74. Palombella, V.J., Rando, O.J., Goldberg, A.L. & Maniatis, T. (1994) Cell 78, 773–785.
75. Lin, L. & Ghosh, S. (1996) Mol. Cell. Biol. 16, 2248–2254.
76. Lam, Y.A., Xu, W., DeMartino, G.N. & Cohen, R.E. (1997) Nature (London) 385, 737–740.
77. Hochstrasser, M. (1996) Annu. Rev. Genet. 30, 405–439.
78. Cary, P.D., King, D.S., Crane-Robinson, C., Bradbury, E.M., Rabbani, A., Goodwin, G.H. & Johns, E.W. (1980) Eur. J. Biochem. 112, 577–580.
79. Weber-Ban, E.U., Reid, B.G., Miranker, A.D. & Horwich, A.L. (1999) Nature (London), in press.
80. Shtilerman, M., Lorimer, G.H. & Englander, S.W. (1999) Science 284, 822–825.

A proteolytic pathway that controls the cholesterol content of membranes, cells, and blood

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

(sterol regulatory element-binding proteins/transcription/Site-1 protease/Site-2 protease/sterol-sensing domain)

MICHAEL S. BROWN* AND JOSEPH L. GOLDSTEIN*

Department of Molecular Genetics, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75235

ABSTRACT The integrity of cell membranes is maintained by a balance between the amount of cholesterol and the amounts of unsaturated and saturated fatty acids in phospholipids. This balance is maintained by membrane-bound transcription factors called sterol regulatory element-binding proteins (SREBPs) that activate genes encoding enzymes of cholesterol and fatty acid biosynthesis. To enhance transcription, the active NH2-terminal domains of SREBPs are released from endoplasmic reticulum membranes by two sequential cleavages. The first is catalyzed by Site-1 protease (S1P), a membrane-bound subtilisin-related serine protease that cleaves the hydrophilic loop of SREBP that projects into the endoplasmic reticulum lumen. The second cleavage, at Site-2, requires the action of S2P, a hydrophobic protein that appears to be a zinc metalloprotease. This cleavage is unusual because it occurs within a membrane-spanning domain of SREBP. Sterols block SREBP processing by inhibiting S1P. This response is mediated by SREBP cleavage-activating protein (SCAP), a regulatory protein that activates S1P and also serves as a sterol sensor, losing its activity when sterols overaccumulate in cells. These regulated proteolytic cleavage reactions are ultimately responsible for controlling the level of cholesterol in membranes, cells, and blood.

Cholesterol has long been known to play an important role in modulating fluidity and phase transitions in the plasma membranes of animal cells (1). Recently, a new role for cholesterol has been appreciated. Cholesterol, together with sphingomyelin, forms plasma membrane rafts or caveolae that are sites where signaling molecules are concentrated (2, 3). To perform these functions, membrane cholesterol must be maintained at a constant level. This homeostasis is achieved by a feedback regulatory system that senses the level of cholesterol in cell membranes and modulates the transcription of genes encoding enzymes of cholesterol biosynthesis and uptake from plasma lipoproteins. The modulators are a family of membrane-bound transcription factors called sterol regulatory element-binding proteins (SREBPs), which must be released proteolytically from membranes to act (4). This article summarizes recent progress in understanding the SREBPs and the sterol-regulated proteases that release them.

Three SREBPs are currently recognized. Two are produced from a single gene through the use of alternate promoters that produce transcripts with different first exons (5). The cDNAs for these proteins, designated as SREBP-1a and SREBP-1c, were cloned from human and mouse cells (6–8). SREBP-1c was cloned independently from rat adipocytes and was designated ADD-1 (9). The third isoform, SREBP-2, is produced from a separate gene (5, 10).

The SREBPs are three-domain proteins of ≈1,150 amino acids that are bound to membranes of the endoplasmic reticulum (ER) and nuclear envelope in a hairpin orientation (4) (see Fig. 1). The NH2-terminal domain of ≈480 amino acids and the COOH-terminal domain of ≈590 amino acids project into the cytosol. They are anchored to membranes by a central domain of ≈80 amino acids that comprises two membrane-spanning sequences separated by a short 31-aa loop that projects into the lumen of the ER and nuclear envelope.

The NH2-terminal domains of SREBPs are transcription factors of the basic-helix-loop-helix-leucine zipper (bHLH-Zip) family (4, 11). The extreme NH2 terminus contains a stretch of acidic amino acids that recruits transcriptional coactivators, including CBP (12). In SREBP-1a and SREBP-2, these acidic sequences are relatively long. In SREBP-1c, the acidic sequence is shorter, and this protein is a much weaker activator than the other two SREBPs (7, 8, 13). The NH2-terminal domains of all three SREBPs also contain a bHLH-Zip motif that mediates dimerization, nuclear entry, and DNA binding. Within the basic region of this motif, the SREBPs contain a tyrosine in place of an arginine that is conserved in nearly all of the other bHLH family members (11, 14). This substitution allows SREBPs to recognize decanucleotide segments of DNA called sterol regulatory elements (SREs) (14). In contrast to the usual binding sites for bHLH proteins, which are palindromic, SREs are nonpalindromic, and they usually contain one or two copies of the sequence CAC (6, 11). When tested for binding activity against random sequences of DNA (14), the SREBPs show a strong preference for the SRE sequence that was originally defined in the enhancers of the genes encoding the low density lipoprotein (LDL) receptor and 3-hydroxy-3-methylglutaryl CoA (HMG-CoA) synthase, namely, TCACCCCACT (15, 16). In other promoters, the SREBPs recognize different sequences, and a clear consensus has not been defined (17).
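The nonpalindromic character of the SRE can be checked directly from the sequence. The following Python sketch is illustrative only: it treats the decanucleotide TCACCCCACT quoted above as an exact-match site, which is a simplification, since the text notes that other promoters carry different sequences and that no clear consensus has been defined. The function names are hypothetical, and the hexamer CACGTG mentioned in the usage note is supplied here only as an example of a palindromic site; it does not come from the text.

```python
# Sketch under stated assumptions: exact matching of the single SRE
# decanucleotide quoted in the text; real SREs vary between promoters.
SRE = "TCACCCCACT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic (in the DNA sense) if it equals its reverse complement."""
    return site.upper() == reverse_complement(site)


def find_sites(dna: str, site: str = SRE):
    """Return sorted (position, strand) pairs for exact matches.

    Positions are 0-based on the given (top) strand; '-' marks a match
    to the reverse complement of the site.
    """
    dna = dna.upper()
    hits = []
    for strand, pattern in (("+", site.upper()), ("-", reverse_complement(site))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = dna.find(pattern, start + 1)
    return sorted(hits)
```

Because the SRE differs from its reverse complement, a scan has to examine both strands separately, whereas a palindromic site such as CACGTG reads the same on either strand.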

In sterol-depleted cells, the NH2-terminal domains of the SREBPs are released from membranes by two sequential proteolytic cleavages that must occur in the proper order (18). The NH2-terminal domain then travels to the nucleus, where it binds to SREs in the enhancers of multiple genes encoding enzymes of cholesterol biosynthesis, unsaturated fatty acid biosynthesis, triglyceride biosynthesis, and lipid uptake (reviewed in ref. 19). In the cholesterol biosynthetic pathway, well defined target genes include HMG-CoA synthase, HMG-CoA reductase, farnesyl diphosphate synthase, and squalene synthase (20). The targets in the fatty acid and triglyceride biosynthetic pathways include acetyl CoA carboxylase, fatty acid synthase, stearoyl CoA desaturase, and glycerol-3-phosphate acyltransferase (4, 17, 20). The SREBPs also enhance transcription of the LDL receptor, which mediates cholesterol uptake from plasma lipoproteins. Overexpression of the NH2-terminal nuclear domains of SREBPs also elevates mRNAs encoding many other enzymes required for lipid synthesis, including enzymes that generate acetyl CoA and reduced pyridine nucleotides (21).

*E-mail: [email protected] or [email protected]. PNAS is available online at www.pnas.org.

Abbreviations: bHLH-Zip, basic-helix-loop-helix-leucine zipper; CHO, Chinese hamster ovary; ER, endoplasmic reticulum; HMG-CoA, 3-hydroxy-3-methylglutaryl CoA; LDL, low density lipoprotein; PLAP, placental alkaline phosphatase; SRE, sterol regulatory element; SREBP, sterol regulatory element-binding protein; SCAP, SREBP cleavage-activating protein; S1P, Site-1 protease; S2P, Site-2 protease.

When sterols build up within cells, the proteolytic release of SREBPs from membranes is blocked. The NH2-terminal domains that have already entered the nucleus are rapidly degraded in a process that is blocked by inhibitors of proteasomes (22). As a result of these events, transcription of all of the target genes declines. This decline is complete for the cholesterol biosynthetic enzymes, whose transcription is entirely dependent on SREBPs. The decline is less complete for the fatty acid biosynthetic enzymes, whose basal transcription can be maintained by other factors (13, 23).

Two-Step Proteolytic Release of SREBPs

The two-step proteolytic release of the NH2-terminal domains is illustrated schematically in Fig. 1. The process begins when a protease, termed Site-1 protease (S1P), cleaves the SREBPs at a site within the hydrophilic loop that projects into the lumen of the ER (Fig. 1 Top). In SREBP-2, this cleavage occurs between the leucine and serine of the sequence RSVLS (24). S1P absolutely requires a basic residue at the P4 position, and it strongly prefers a leucine at the P1 position. The residues at the P2, P3, and P1′ positions can be substituted freely without affecting cleavage (24).
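The positional rule described above can be written as a short sequence scan. The Python sketch below is a deliberately simplified reading of the text, not a validated predictor: it treats "basic residue" at P4 as arginine or lysine (an assumption), treats the strong preference for leucine at P1 as a strict requirement (also an assumption), and leaves P3, P2, and P1′ unconstrained. The function names and the flanking residues in the usage note are hypothetical.

```python
import re

# Assumption: "basic residue" is read as Arg (R) or Lys (K).
BASIC = "RK"

# P4 basic, P3 and P2 any residue, P1 Leu. The lookahead allows
# overlapping matches and requires that a P1' residue exists.
_S1P_MOTIF = re.compile(r"(?=([%s]..L).)" % BASIC)


def candidate_s1p_sites(protein: str):
    """Return 0-based indices of P1' residues; the scissile bond lies
    immediately before each returned index."""
    protein = protein.upper()
    return [m.start() + 4 for m in _S1P_MOTIF.finditer(protein)]


def cleave(protein: str, site: int):
    """Split a sequence into (NH2-terminal part, COOH-terminal part) at site."""
    return protein[:site], protein[site:]
```

Applied to the SREBP-2 sequence RSVLS embedded in arbitrary flanking residues, for example `AARSVLSGG`, the scan returns a single site, and `cleave` separates the sequence between the leucine and the serine, as described for Site-1 cleavage.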

FIG. 1. Model for the sterol-mediated proteolytic release of SREBPs from membranes. (Top) Release is initiated by Site-1 protease (S1P), a sterol-regulated protease that recognizes the SCAP/SREBP complex and cleaves SREBP in the luminal loop between two membrane-spanning sequences. SCAP allows Site-1 cleavage to be activated when cells are deprived of sterols, and it inhibits this process when sterols are abundant. (Middle) Once the two halves of SREBP are separated, a second protease, Site-2 protease (S2P), cleaves the NH2-terminal bHLH-Zip domain of SREBP at a site located within the membrane-spanning region. (Bottom) After the second cleavage, the NH2-terminal bHLH-Zip domain leaves the membrane, carrying three hydrophobic residues at its COOH-terminus. The protein enters the nucleus, where it activates target genes controlling lipid synthesis and uptake.

Cleavage by S1P separates the SREBPs into two halves, both of which remain membrane-bound (Fig. 1 Middle). The separation can be detected by immunoprecipitation experiments; after cleavage, an antibody against the COOH-terminal domain no longer precipitates the membrane-bound NH2-terminal domain. The membrane-bound NH2-terminal domain is termed the intermediate fragment of SREBP (18).

After the two halves of the SREBP have separated, a second protease, designated Site-2 protease (S2P), cleaves the NH2-terminal intermediate fragment at a site that is just within its membrane-spanning domain (Fig. 1 Middle). In SREBP-2, this cleavage occurs between the leucine and cysteine of the sequence DRSRILLC (25). The second arginine of this sequence is believed to represent the boundary between the hydrophilic NH2-terminal domain and the hydrophobic membrane-spanning segment. Thus, the cleavage occurs three residues within the membrane-spanning segment. When the NH2-terminal fragment leaves the membrane to enter the nucleus, it carries the three hydrophobic ILL residues at its COOH-terminus (Fig. 1 Bottom). Studies of intact cells showed that recognition by S2P requires all or part of the DRSR sequence. The exact recognition sequence has not been defined. Each of the ILLC residues can be replaced singly with alanines without affecting cleavage (25).

Sterols block the proteolytic release process by selectively inhibiting cleavage by S1P (Fig. 1 Top). Current evidence indicates that S2P is not regulated directly by sterols, but it is regulated indirectly because the enzyme cannot act until the two halves of SREBP have been separated through the action of S1P (18).
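The ordering of the two cleavages can be summarized as a toy boolean model. The sketch below is an illustration only, not a quantitative description: the function name `srebp_state` and its parameters are hypothetical, and each component is reduced to present/absent and sterols to high/low.

```python
def srebp_state(sterols_high, scap_ok=True, s1p_ok=True, s2p_ok=True):
    """Return the predicted form of SREBP in a toy boolean model of the pathway."""
    # Site-1 cleavage is the sterol-inhibited step; per Fig. 1 it acts on the
    # SCAP/SREBP complex, so both SCAP and S1P are required here.
    site1_cleaved = scap_ok and s1p_ok and not sterols_high
    if not site1_cleaved:
        return "full-length precursor (membrane-bound)"
    # Site-2 cleavage can occur only after the two halves have been separated.
    if not s2p_ok:
        return "intermediate fragment (membrane-bound)"
    return "NH2-terminal domain released (nuclear)"


for sterols in (False, True):
    print("sterols high:", sterols, "->", srebp_state(sterols))
```

In this simplification S2P has no sterol input of its own; its apparent regulation arises entirely from the requirement for prior Site-1 cleavage.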

SREBP Cleavage-Activating Protein (SCAP)

The first advance in understanding SREBP regulation came with the isolation of a cDNA encoding SREBP cleavage-activating protein (SCAP), a regulatory protein that is required for cleavage at Site-1 (26). SCAP is an integral membrane protein of 1,276 amino acids with two distinct domains. The NH2-terminal domain of ≈730 amino acids consists of alternating hydrophilic and hydrophobic sequences that appear to form eight membrane-spanning helices (27). This domain anchors SCAP to membranes of the ER. The COOH-terminal domain of ≈550 amino acids projects into the cytosol. It contains five WD-repeats. Similar repeats, each ≈40 residues in length, are found in many intracellular proteins, where they often mediate protein-protein interactions (28). The crystal structure of one such protein, the β-subunit of heterotrimeric G proteins, revealed that the WD-repeats form the blades of a propeller-like structure that bridges the α- and γ-subunits (29, 30).

Within cells, SCAP is found in a tight complex with SREBPs (31, 32). The association is mediated by an interaction between the COOH-terminal regulatory domain of the SREBP and the WD-repeat domain of SCAP. Formation of this complex is required for Site-1 cleavage, as revealed by the following experiments (31, 32): (i) truncation of the COOH-terminal domain of SREBP-2 prevents interaction with SCAP and abolishes susceptibility to cleavage by S1P; (ii) overexpression of a cDNA encoding the membrane-anchored COOH-terminal domain of either SCAP or SREBP-2 competitively disrupts the formation of the complex between endogenous SCAP and endogenous SREBP-2, and this abolishes Site-1 cleavage. This block can be overcome by overexpressing full-length SCAP or SREBP-2. Based on these findings, we hypothesized that the SCAP/SREBP complex is the true substrate for S1P (Fig. 1 Top).

SCAP as a Sterol Sensor

In addition to its requirement for Site-1 cleavage, SCAP is also the target for sterol suppression of this cleavage. This conclusion emerged from studies of mutant Chinese hamster ovary (CHO) cells that were selected for resistance to oxysterol-mediated feedback suppression of SREBP activity (26). When added to the medium surrounding cultured cells, certain oxysterols, including 25-hydroxycholesterol, block the Site-1 cleavage of SREBPs and thereby abolish cholesterol synthesis (4). These oxysterols cannot replace the functions of cholesterol in cell membranes, and the cells therefore die unless they are given a usable exogenous source of cholesterol. Oxysterol-resistant mutants survive under these conditions because they fail to respond to oxysterols by turning off cholesterol synthesis, and this forms the basis of a genetic selection (33).

Oxysterol-resistant mutant CHO cells fall into two complementation classes, both of which are genetically dominant. Class 1 mutants are sterol-resistant because they produce a truncated form of SREBP-2 that encodes the complete NH2-terminal segment but terminates before the membrane attachment domain (34, 35). The truncated protein goes directly to the nucleus without a requirement for proteolysis, and thus it cannot be suppressed by oxysterols.

Class 2 mutants produce normal full-length SREBP-1 and SREBP-2 and proteolyze them normally, but they cannot turn off proteolysis in response to sterol overload. We identified the defective gene in the Class 2 mutants by preparing a cDNA library from one of the mutant cell lines, transfecting pools of cDNAs into cultured human embryonic kidney 293 cells, and assaying for a relief of the oxysterol-dependent inhibition of expression of a reporter gene driven by an SRE-containing promoter. One cDNA was found to confer the oxysterol resistance phenotype, and this turned out to encode a mutant version of SCAP (26). The gene had undergone a G-to-A substitution, which changed amino acid 443 from aspartic acid to asparagine (Fig. 2). The identical point mutation was found in two other independently isolated mutant cell lines (36). In a fourth cell line, a point mutation in the SCAP gene changed a tyrosine at amino acid 298 to cysteine (37) (Fig. 2). When any of these mutant SCAP cDNAs is transfected into wild-type cells, it abolishes the susceptibility of S1P to inhibition by oxysterols, including 25-hydroxycholesterol (26). We interpret these findings to indicate that sterols normally suppress S1P activity by interacting with SCAP, either directly or indirectly. The mutant forms of SCAP are resistant to sterol inhibition, and therefore they continue to facilitate S1P activity even when sterols are present. The ability of the mutant SCAP to act in the presence of oxysterols represents a gain of function, and this explains the dominant defect in the oxysterol-resistant cells.
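The single-base changes that can convert an aspartic acid codon into an asparagine codon, as in the substitution at residue 443, can be enumerated from the standard genetic code. The sketch below is an illustration on the coding strand; the codon actually used at position 443 is not given in the text, so both aspartic acid codons are considered, and the function name is hypothetical.

```python
ASP_CODONS = {"GAT", "GAC"}  # aspartic acid (standard genetic code, DNA alphabet)
ASN_CODONS = {"AAT", "AAC"}  # asparagine


def asp_to_asn_changes():
    """List (codon, position, old_base, new_base, new_codon) for single-base changes."""
    hits = []
    for codon in sorted(ASP_CODONS):
        for pos in range(3):
            for base in "ACGT":
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                if mutant in ASN_CODONS:
                    hits.append((codon, pos + 1, codon[pos], base, mutant))
    return hits


for codon, pos, old, new, mutant in asp_to_asn_changes():
    print(f"{codon} -> {mutant}: {old}-to-{new} at codon position {pos}")
```

Both aspartic acid codons reach an asparagine codon only through a G-to-A change at the first codon position.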

FIG. 2. Membrane topology of SCAP, showing the location of two point mutations that produce a sterol-resistant phenotype in mutant cells. The yellow region denotes the putative sterol-sensing domain of SCAP.

FIG. 3. Membrane proteins that contain sterol-sensing domains. The identified proteins are Chinese hamster SCAP (1,276 amino acids), Chinese hamster HMG-CoA reductase (887 amino acids), mouse Niemann-Pick type C1 (NPC1) (1,278 amino acids), and mouse Patched (1,434 amino acids). The sterol-sensing domains of these proteins, denoted in yellow, correspond to the following residues: SCAP, amino acids 280–446; HMG-CoA reductase, amino acids 57–224; NPC1, amino acids 617–791; and Patched, amino acids 420–589. The sequence alignments of the four sterol-sensing domains are published in Fig. 2 of ref. 37.

The remarkable aspect of the oxysterol-resistant forms of SCAP is that both of the sterol resistance mutations fall within a 160-aa segment of the membrane domain of SCAP (Fig. 2). This segment, which comprises five of the eight membrane-spanning sequences of SCAP, has been termed the sterol-sensing domain. A similar stretch of five membrane-spanning sequences has been identified in three other proteins, each of which is influenced by cholesterol (Fig. 3). A sterol-sensing domain is found in the membrane attachment region of the ER enzyme, HMG-CoA reductase (26). This domain is responsible for the enhanced degradation of HMG-CoA reductase that occurs when oxysterols are added to cells (38, 39). A similar sterol-sensing domain is found in the Niemann-Pick type C1 protein, which is required for the movement of LDL-derived cholesterol from the lysosome to the ER (40). A sterol-sensing domain also has been identified in Patched, a polytopic membrane protein that serves as the receptor for the morphogenic protein Hedgehog (41), the only known protein to which cholesterol is covalently attached (42). Whether the sterol-sensing domains interact directly with sterols, or whether they recognize other proteins that are in turn influenced by sterols, is not known.

Candidate Gene for Site-2 Protease

In addition to yielding SCAP, somatic cell genetics has also yielded candidate genes for the Site-2 and Site-1 proteases. The first of these, termed S2P, was isolated from a mutant line of CHO cells that is unable to produce LDL receptors, cholesterol biosynthetic enzymes, or fatty acid desaturases (43). The molecular defect was traced to a specific inability to carry out Site-2 cleavage of SREBPs (18, 44). The cells cleave the SREBPs at Site-1, but the NH2-terminal domain remains membrane-bound, owing to the failure of cleavage at Site-2. These cells are therefore auxotrophs that require cholesterol and unsaturated fatty acids for growth.

Hasan et al. at Dartmouth (43) found that the defect in one cholesterol auxotrophic cell line (M19 cells) was recessive, and they corrected the defect by transfecting genomic DNA from normal human cells and selecting for the ability to grow in the absence of cholesterol. Genomic DNA from the transfected cells was used to transfect fresh M19 cells, and this procedure was repeated several times, both at Dartmouth and at the University of Texas Southwestern Medical Center. Each repetition led to the elimination of extraneous human DNA, and eventually the cells retained only a small amount of human DNA that included the complementing gene. The human DNA from these cells was detected by PCR using repetitive human Alu elements as primers. Eventually, we were able to identify the human gene that complemented the defect in the M19 cell. Transfection of a cDNA encoded by this gene restores Site-2 cleavage in M19 cells and abolishes cholesterol auxotrophy (44).

The gene that complements the defect in M19 cells was called S2P (44). This gene encodes a protein that is necessary for Site-2 cleavage of SREBPs. Although circumstantial evidence suggests that S2P may be the Site-2 protease (see below), we have no direct biochemical evidence to support this contention. S2P might also be an auxiliary factor that is necessary in order for the true Site-2 protease to act.

The human S2P gene encodes an extremely hydrophobic protein of 519 amino acids (Fig. 4B). Most of the protein is hydrophobic, but there are two hydrophilic stretches, one of which is cysteine-rich and the other of which contains a stretch of 23 consecutive serines. Current evidence indicates that these two hydrophilic sequences project into the lumen of the ER and the remainder of the protein is embedded in the membrane itself (N. Zelenski, R. B. Rawson, J.L.G., and M.S.B., unpublished work).

One of the hydrophobic segments of S2P contains the sequence HEIGH, which conforms to the HEXXH consensus for the active site of zinc metalloproteases. This large and well studied family has members in every living organism from Archaea to humans (45, 46). One particularly well studied example is the bacterial enzyme thermolysin (47). In these proteases, the two histidines coordinate a zinc atom, and the glutamic acid polarizes a water molecule so that it can make a nucleophilic attack on the peptide bond. The two X amino acids are variable among family members, but in several cases they are isoleucine-glycine, thus conforming to the exact sequence in S2P. Mutagenesis experiments confirmed that the HEXXH sequence is required in order for S2P to restore Site-2 cleavage in M19 cells (44). When either of the histidines or the glutamic acid was changed to alanine, the protein lost the ability to restore Site-2 cleavage. Computer-based searches of DNA databases revealed fragments of DNA encoding parts of proteins with significant resemblances to S2P in Drosophila melanogaster (33% identity over 197 residues); Caenorhabditis elegans (43% identity over 199 residues); Schistosoma mansoni (27% identity over 117 residues); and Sulfolobus solfataricus (25% identity over 366 residues). All of these proteins share the HEXXH consensus except S. mansoni, whose available sequence does not extend into this region. All of these proteins also share the overall hydrophobic character of human S2P (44).
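A search for the HEXXH consensus can be sketched with a regular expression. This is a minimal illustration, not the database search described above: the function name `find_hexxh` and the residues flanking HEIGH in the toy sequences are hypothetical.

```python
import re

# Lookahead form so that overlapping occurrences would also be reported.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(seq):
    """Return (0-based start, matched pentapeptide) for every HEXXH occurrence."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(seq.upper())]


wild_type = "LLAHEIGHALL"   # HEIGH is from the text; the flanks are made up
his_to_ala = "LLAAEIGHALL"  # first histidine replaced by alanine
print(find_hexxh(wild_type))
print(find_hexxh(his_to_ala))
```

Changing either histidine or the glutamic acid to alanine removes the match, which corresponds to the substitutions tested in the mutagenesis experiments.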

FIG. 4. Hydropathy plots of hamster Site-1 protease (A) and human Site-2 protease (B). The residue-specific hydropathy index was calculated over a window of 20 residues by the method of Kyte and Doolittle (60) as described (44, 51). For Site-1 protease, arrows denote the three amino acids of S1P that correspond to the catalytic triad for subtilisin-like serine proteases. For Site-2 protease, the arrow denotes the sequence in S2P corresponding to the consensus HEXXH pentapeptide metal-binding site for zinc metalloproteases. The one transmembrane sequence in S1P is denoted by the horizontal bar. The serine- and cysteine-rich regions in S2P are indicated.
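A sliding-window hydropathy calculation of the kind named in the Fig. 4 legend can be sketched as follows. The per-residue values are the published Kyte-Doolittle scale; the function name `hydropathy_profile`, the unweighted window mean, and the toy sequence are assumptions, and the exact plotting conventions used for Fig. 4 are not stated here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=20):
    """Return the mean hydropathy of each full window along the sequence."""
    seq = seq.upper()
    if len(seq) < window:
        return []  # no full window fits
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Toy sequence: a hydrophobic stretch followed by a serine-rich stretch.
toy = "L" * 20 + "S" * 23
profile = hydropathy_profile(toy)
print(round(profile[0], 2), round(profile[-1], 2))
```

Windows with a strongly positive mean suggest membrane-embedded segments, whereas a serine-rich stretch gives a negative mean.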

The mutagenesis data are consistent with the idea that S2P is indeed the Site-2 protease, but so far our multiple attempts to demonstrate in vitro protease activity for isolated S2P have failed. It is likely that these failures relate to the formidable technical difficulty in producing an active form of a membrane-embedded enzyme, especially one whose putative substrate is a leucine-cysteine bond that is sequestered within the membrane-spanning region of another protein (25). Getting the enzyme and substrate together in a test tube has proven extremely difficult.

If S2P is indeed a zinc metalloprotease, its hydrophobicity distinguishes it from other members of this family. Although the family includes membrane-bound enzymes such as matrix metalloproteases and the converting enzymes for angiotensin and endothelin, their structures differ fundamentally from that of S2P. In these other enzymes, the active sites are contained within hydrophilic domains that resemble those of soluble zinc metalloproteases (46). The catalytic domain is simply attached to the membrane by a hydrophobic extension. In S2P the putative active site is contained within an otherwise hydrophobic sequence that appears to be embedded in the membrane (Fig. 4B). If S2P is a protease, it will be the first identified protease whose substrate is a membrane-spanning region of another protein. Proteolysis within a lipid bilayer may require a hydrophobic enzyme. How such an enzyme would function in such an environment is unknown.

Inasmuch as the S2P gene was cloned by complementation of the defect in M19 cells, it was important to demonstrate that this gene was indeed mutated in this cell line. Northern gel analysis showed that the S2P mRNA was detectable in wild-type CHO cells and in all organs studied, but it was not detectable in M19 cells (44). The S2P gene was mapped to the X chromosome (44). Although wild-type CHO-K1 cells should have two copies of this gene, Southern blotting data suggested that the cells had only one copy. In the M19 cells, which were derived from CHO-K1 cells, this single copy had undergone a complex rearrangement, precluding transcription (44).

Candidate Gene for Site-1 Protease

The somatic cell genetic approach that permitted the cloning of S2P initially presented obstacles when we tried to use it for cloning S1P. The difficulty arose because of the presence of only a single copy of the S2P gene in the parental CHO-K1 cells. Whenever we mutated CHO cells and selected for cholesterol auxotrophy, we always isolated cells with mutations in S2P. We reasoned that this was because of the high likelihood of obtaining a mutation in a single-copy gene as compared with the low likelihood of obtaining simultaneous mutations in two copies of a gene, as was presumably the case for the S1P gene.

To circumvent this problem, we transfected CHO-K1 cells with an expressible cDNA encoding S2P and isolated a permanent cell line that contains multiple copies of this cDNA, thereby reducing the likelihood of obtaining S2P-deficient mutants (48). After mutagenesis, several approaches were used to isolate cells that were deficient in S1P (48). In the most successful approach, we first attempted to enrich for mutants that were haploinsufficient for S1P by incubating the cells with LDL that had incorporated a fluorescent cholesteryl ester, pyrene-methyl cholesteryl oleate (PMCA-oleate). We reasoned that cells with only a single copy of S1P would produce fewer LDL receptors because they would have lower amounts of nuclear SREBPs. Cells that were incubated with fluorescent LDL were separated by a fluorescence-activated cell sorter, and the cells with the lowest uptake were selected.

The sorted cells were subjected to a second round of mutagenesis in an attempt to inactivate the single remaining copy of the S1P gene (48). The cells then were selected for complete cholesterol auxotrophy by using a modification of the amphotericin resistance approach originally developed by Limanek et al. (49). In this procedure, cells are incubated briefly in a low concentration of LDL as the sole source of cholesterol. Cells that have normal SREBP activity will maintain their cholesterol levels as a result of enhanced cholesterol synthesis and uptake of LDL through LDL receptors. Cells with blocks in SREBP processing cannot obtain cholesterol from either of these sources, and they therefore become depleted in cholesterol. The cells then are treated with amphotericin, a polyene antibiotic that disrupts plasma membranes by forming complexes with cholesterol (50). Whereas wild-type cells are killed by amphotericin, cholesterol-deficient cells are resistant. After this selection, the cholesterol auxotrophs are rescued by addition of a mixture of cholesterol (dissolved in ethanol), small amounts of mevalonate to supply nonsterol products, and oleate to counteract the anticipated block in synthesis of unsaturated fatty acids (48).

The two-step mutagenesis approach described above and a modified one-step version of this approach yielded several cell lines that were auxotrophic for cholesterol because they failed to cleave SREBPs at Site-1. Cell fusion studies showed that these defects were recessive (48). We then used these cells as recipients in a transient transfection protocol designed to clone the defective gene. As a reporter in these assays, we designed a vector that encodes a fusion protein whose secretion from cells depends on cleavage by S1P. The fusion protein consists of human placental alkaline phosphatase (PLAP) joined to the COOH-terminal half of SREBP-2 (51) (Fig. 5). PLAP is a membrane-bound enzyme that is normally translocated to the plasma membrane with its catalytic domain facing the extracellular space. It is anchored to the membrane by a COOH-terminal glycophospholipid anchor. The PLAP/BP2 fusion protein begins with the signal sequence of alkaline phosphatase followed by the catalytic domain. The PLAP is truncated to eliminate its COOH-terminal membrane anchor, and the truncated PLAP is fused to the luminal loop of SREBP-2 just to the NH2-terminal side of the RSVL recognition sequence for S1P.

When the PLAP/BP2 fusion protein is expressed in wild-type cells, the catalytic domain is translocated into the ER lumen by virtue of the PLAP signal sequence. The NH2-terminal end of PLAP is freed from its membrane attachment by signal peptidase. The COOH-terminal end remains attached to the membrane by virtue of its connection to the COOH-terminal half of SREBP-2. Cleavage by S1P releases the catalytic domain into the lumen and allows it to be secreted into the medium where its activity can be measured by a sensitive chemiluminescence assay (51).

FIG. 5. Proteolytic processing and secretion of the PLAP/BP2 fusion protein used for the complementation cloning of S1P. The details of the construction of the plasmid encoding this fusion protein are described in ref. 51. In brief, the plasmid was generated by fusing the sequence encoding the signal peptide and soluble catalytic domain of human placental alkaline phosphatase (amino acids 1–506) with the sequence encoding amino acids 513–1,141 of human SREBP-2. Secretion of the catalytic domain of PLAP requires cleavage by signal peptidase and Site-1 protease. [Figure reproduced with permission from ref. 51 (Copyright 1998, Cell Press).]

Validation experiments showed that wild-type CHO cells secreted PLAP into the medium when transfected with the cDNA encoding the PLAP/BP2 fusion protein (51). Secretion required cotransfection with a vector encoding SCAP, apparently because the endogenous SCAP was not sufficient to yield high-level cleavage of the protein. Secretion was suppressed by sterols, and it also was abolished when the arginine of the RSVL sequence was changed to alanine. All of these findings strongly suggested that secretion of PLAP required S1P. This was confirmed when we produced the PLAP/BP2 fusion protein in the mutant SRD-12B cells that lack S1P activity. These cells were unable to secrete PLAP even when they were cotransfected with the SCAP-producing vector.

To clone the S1P gene, we transiently transfected the mutant SRD-12B cells with the PLAP/BP2 expression vector, a plasmid encoding SCAP, and pools of cDNAs from an expression library derived from CHO cells that produce S1P (51). To control for transfection efficiency, we included a vector encoding β-galactosidase driven by the cytomegalovirus promoter. After transfection, the medium was assayed for PLAP activity, and the cells were assayed for β-galactosidase. We tested 300 pools of 1,000 cDNAs per pool, and identified two pools that were able to restore the secretion of PLAP in the SRD-12B cells. Subdivision of these positive pools eventually led to the purification of a single positive cDNA.
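The logic of correcting secreted PLAP activity for transfection efficiency can be sketched as a ratio of the two reporter readings. Everything numeric below is hypothetical: the scoring rule (a ratio more than `fold` times the median ratio), the default `fold` value, the function name `positive_pools`, and the example readings are assumptions for illustration, since the criterion actually used to call a pool positive is not stated here.

```python
from statistics import median


def positive_pools(plap, bgal, fold=5.0):
    """Return indices of pools whose PLAP/beta-gal ratio exceeds fold x the median.

    plap and bgal are per-pool readings in arbitrary units; bgal must be nonzero.
    """
    ratios = [p / b for p, b in zip(plap, bgal)]
    cutoff = fold * median(ratios)
    return [i for i, r in enumerate(ratios) if r > cutoff]


# Scale of the screen described in the text.
pools, cdnas_per_pool = 300, 1000
print(pools * cdnas_per_pool)  # 300000 cDNAs sampled in total

# Hypothetical readings for four pools; only the last one stands out.
print(positive_pools([10, 12, 11, 200], [1, 1, 1, 1]))
```

A pool with high raw PLAP activity but proportionally high β-galactosidase activity would not be flagged, which is the purpose of including the cotransfected control.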

The positive cDNA encoded a protein of 1,052 amino acids whose sequence had all of the properties expected for an enzyme that cleaves the luminal RSVL sequence at Site-1 of SREBPs (51). We therefore named this protein S1P. The protein begins with a hydrophobic stretch with the typical properties of a signal sequence, indicating that it is translocated into the ER lumen (Fig. 4A). The signal sequence is followed by a domain that identifies it as a member of the large family of subtilisin-related serine proteases. This is followed by a COOH-terminal extension that also is predicted to lie within the lumen, followed by a hydrophobic putative transmembrane domain and a short sequence that is predicted to lie on the cytoplasmic side of the membrane. This COOH-terminal tail has a strikingly basic character.

Subtilisin-related enzymes, or subtilases, are serine proteases that contain a catalytic site with the classic triad of serine, aspartic acid, and histidine residues as well as a remote asparagine that contributes to a so-called oxyanion hole (52). Although they share the catalytic triad with the other large family of serine proteases, the trypsin-like enzymes, the subtilases are believed to have evolved independently. Members of the subtilisin family are found in all living cells from bacteria to humans. In mammals, the previously characterized members of this family consist of the prohormone convertases, of which furin is the prototype. These enzymes function within the lumen of organelles in the secretory pathway, where they cleave membrane-bound or secretory proteins (such as the insulin pro-receptor, pro-von Willebrand factor, and proopiomelanocortin) before their transport to the cell surface or secretion from the cell (53, 54). All of the mammalian prohormone convertases cleave after basic residues, usually after dibasic sequences, and most of them also require a basic residue at the P4 site. The classic recognition sequence is RX(R/K)R (54). Prokaryotic members of this family, typified by Savinase from Bacillus lentus, cleave after hydrophobic residues without a requirement for any basic residue (55). The sequence of the catalytic domain of mammalian S1P more closely resembles that of bacterial Savinase than that of mammalian subtilisins. This observation is consistent with the predicted ability of S1P to cleave after a hydrophobic residue: i.e., the leucine of the RSVL sequence of SREBPs (24).
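The contrast between the RX(R/K)R consensus and the RSVL sequence can be illustrated by classifying P4-P1 tetrapeptides with two patterns. This is a sketch under stated assumptions: the S1P-type pattern (arginine or lysine at P4, leucine at P1) is a simplification of the preferences described for S1P, the function name is hypothetical, and RAKR is a made-up example of a sequence fitting the RX(R/K)R consensus.

```python
import re

FURIN_LIKE = re.compile(r"R.[RK]R")  # RX(R/K)R, read as residues P4 to P1
S1P_LIKE = re.compile(r"[RK]..L")    # assumption: basic P4 (Arg/Lys), Leu at P1


def classify_p4_p1(tetrapeptide):
    """Return the labels of the patterns that a P4-P1 tetrapeptide satisfies."""
    pep = tetrapeptide.upper()
    labels = []
    if FURIN_LIKE.fullmatch(pep):
        labels.append("furin-like")
    if S1P_LIKE.fullmatch(pep):
        labels.append("S1P-like")
    return labels


for pep in ("RSVL", "RAKR", "ASVL"):
    print(pep, classify_p4_p1(pep))
```

Both patterns share the basic P4 residue; they differ in whether P1 is basic or a leucine, which is the distinction drawn in the text.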

The sequence of human S1P was first reported by a Japanese group who sequenced random cDNAs from a human myeloid cell library (56). By virtue of its DNA sequence, the encoded protein was recognized as a member of the subtilisin family, and the catalytic triad residues were predicted. However, the putative enzyme was not assayed, and nothing was known of its physiologic function. The hamster S1P that we cloned by complementation is 97% identical to the human sequence (51). Using reverse transcriptase-PCR and degenerate oligonucleotides corresponding to the catalytic-site residues of bacterial subtilisin, Seidah et al. (57) recently cloned a cDNA, designated SKI-1, from mouse and rat cells whose amino acid sequences are 97% identical to those of hamster and human S1P. SKI-1 thus appears to be the murid ortholog of hamster and human S1P.

Northern blotting showed that the S1P mRNA is produced in wild-type CHO cells and in all 15 human tissues that were examined. The mRNA was not detectable in the mutant SRD-12B cells. Genomic blots showed that these cells contain one copy of a rearranged S1P gene and a second copy that has a normal restriction pattern but is presumably mutated not to produce detectable mRNA (51).

When we introduced an expression vector encoding S1P into SRD-12B cells, we restored the ability of these cells to cleave SREBP-1 and SREBP-2 at Site-1 in a sterol-regulated manner (51). The cells were now able to synthesize their own cholesterol, and all of their auxotrophies were abolished. Transfected S1P could not restore any of these functions when we replaced any one of the three residues that were predicted to form the catalytic triad, further supporting the notion that this protein is indeed a serine protease (51). This conclusion was supported by the finding of Seidah et al. (57), who showed that the culture medium from cells overexpressing S1P (or SKI-1) could cleave pro-brain-derived neurotrophic factor after the threonine of an RGLTS sequence.

Cell fractionation experiments confirmed that S1P is an intrinsic membrane protein (51). The protein was shown to contain N-linked carbohydrates that remained in the endoglycosidase H-sensitive form, suggesting that the protein did not reach the medial-Golgi apparatus (51). Seidah et al. (57) used immunofluorescence techniques to study the distribution of S1P (or SKI-1) in cells stably overexpressing the protein. They found the protein in structures that resembled the ER, the Golgi complex, and small vesicles. Whether this reflects the distribution of the endogenous native protein remains unknown. Like other members of the subtilisin family, S1P is predicted to have an NH2-terminal propeptide that must be cleaved in order for it to form an active enzyme. The site of this cleavage and its mechanism remain to be determined.

Unresolved Questions

From the standpoint of physiologic regulation, the crucial unresolved questions relate to the requirement for SCAP in the S1P cleavage reaction and the mechanism by which SCAP activity is abolished by sterols. All of the known members of the subtilisin family function independently, and they do not require a membrane-bound cofactor like SCAP. Does SCAP play a role in the direct recognition of SREBP by S1P? Or does SCAP play a more indirect role, perhaps by transporting SREBPs to the places in the cell where the active form of S1P resides?

Some evidence in favor of the latter mechanism has come from a study of the carbohydrate composition of SCAP. When CHOcells were grown in the presence of sterols and SCAP activity was suppressed, the N-linked carbohydrates of SCAP remained in theendoglycosidase H-sensitive form, suggesting that SCAP remained in the ER (37). However, after cells were switched to sterol-depleted medium and cleavage of SREBPs was inaugurated, the carbohydrates of SCAP were converted to the endoglycosidase H-resistant form. The latter observation indicates that SCAP had reached the medial-Golgi complex (37), yet our preliminary cellfractionation experiments show that the bulk of the endoglysidase H-resistant protein was still in the ER. We interpret these data toindicate that, in sterol-depleted cells, SCAP cycles from the ER to the Golgi and back again. Inasmuch as SCAP is in a complex withfull-length SREBP, these data suggest that SCAP may escort SREBP to some post-ER compartment where cleavage takes place. Whensterols are added to cells, SCAP remains in the ER, presumably in a complex with SREBP. This may prevent SREBP from reaching theorganelle that contains active S1P, thereby precluding cleavage. This hypothesis should be testable now that SCAP and S1P have beenidentified.

A second unresolved question relates to potential roles of S2P and S1P in proteolytic processing of other proteins in addition to SREBPs. As noted above, hydrophobic proteins that resemble S2P, including the putative zinc-binding site, are found as far back as Archaea. This suggests that S2P may play more general housekeeping roles in addition to processing SREBPs.

S1P also may play a more general role in proteolytic cleavage. S1P is the first vertebrate subtilisin whose sequence more closely resembles the bacterial members of this family than the mammalian members. This finding is consistent with the observation that S1P cleaves SREBP after a hydrophobic leucine residue rather than after a basic residue. S1P also appears to act in a pre-Golgi compartment, in which it differs from the prohormone convertases, which generally act in the Golgi or in post-Golgi compartments (53, 54). The requirement for SCAP suggests that the activity of S1P may be restricted to SREBPs because no other proteins are known to require SCAP for cleavage. Moreover, cells that lack S1P grow normally as long as they are supplied with the end-products of the SREBP pathway (48). On the other hand, the finding that S1P (or SKI-1) can cleave pro-brain-derived neurotrophic factor when overexpressed in intact cells or in vitro raises the possibility that the protease may have broader actions. This argument is rendered less persuasive by the observation that the site in pro-brain-derived neurotrophic factor that is cleaved by S1P does not correspond to the major site of physiologic pro-brain-derived neurotrophic factor processing in vivo (58, 59).

Clearly, the intense study of S1P and S2P is only beginning. Given the rich scientific experience with other proteases, all of the unresolved questions about these two reactions will likely be answered in the near future. These answers should markedly advance our knowledge of cholesterol homeostasis.

This work was supported by research grants from the National Institutes of Health (HL20948) and the Perot Family Foundation.

1. Devaux, P.F. (1993) Curr. Opin. Struct. Biol. 3, 489–494.
2. Simons, K. & Ikonen, E. (1997) Nature (London) 387, 569–572.
3. Anderson, R.G.W. (1998) Annu. Rev. Biochem. 67, 199–225.
4. Brown, M.S. & Goldstein, J.L. (1997) Cell 89, 331–340.
5. Hua, X., Wu, J., Goldstein, J.L., Brown, M.S. & Hobbs, H.H. (1995) Genomics 25, 667–673.
6. Yokoyama, C., Wang, X., Briggs, M.R., Admon, A., Wu, J., Hua, X., Goldstein, J.L. & Brown, M.S. (1993) Cell 75, 187–197.
7. Shimano, H., Horton, J.D., Shimomura, I., Hammer, R.E., Brown, M.S. & Goldstein, J.L. (1997) J. Clin. Invest. 99, 846–854.
8. Shimomura, I., Shimano, H., Horton, J.D., Goldstein, J.L. & Brown, M.S. (1997) J. Clin. Invest. 99, 838–845.
9. Tontonoz, P., Kim, J.B., Graves, R.A. & Spiegelman, B.M. (1993) Mol. Cell. Biol. 13, 4753–4759.
10. Hua, X., Yokoyama, C., Wu, J., Briggs, M.R., Brown, M.S., Goldstein, J.L. & Wang, X. (1993) Proc. Natl. Acad. Sci. USA 90, 11603–11607.
11. Parraga, A., Bellsolell, L., Ferre-D’Amare, A.R. & Burley, S.K. (1998) Structure (London) 6, 661–672.
12. Näär, A.M., Beaurang, P.A., Robinson, K.M., Oliner, J.D., Avizonis, D., Scheek, S., Zwicker, J., Kadonaga, J.T. & Tjian, R. (1998) Genes Dev. 12, 3020–3031.
13. Pai, J., Guryev, O., Brown, M.S. & Goldstein, J.L. (1998) J. Biol. Chem. 273, 26138–26148.
14. Kim, J.B., Spotts, G.D., Halvorsen, Y.-D., Shih, H.-M., Ellenberger, T., Towle, H.C. & Spiegelman, B.M. (1995) Mol. Cell. Biol. 15, 2582–2588.
15. Smith, J.R., Osborne, T.F., Brown, M.S., Goldstein, J.L. & Gil, G. (1988) J. Biol. Chem. 263, 18480–18487.
16. Smith, J.R., Osborne, T.F., Goldstein, J.L. & Brown, M.S. (1990) J. Biol. Chem. 265, 2306–2310.
17. Magana, M.M. & Osborne, T.F. (1996) J. Biol. Chem. 271, 32689–32694.
18. Sakai, J., Duncan, E.A., Rawson, R.B., Hua, X., Brown, M.S. & Goldstein, J.L. (1996) Cell 85, 1037–1046.
19. Horton, J.D. & Shimomura, I. (1999) Curr. Opin. Lipidol. 10, 143–150.
20. Edwards, P.A. & Ericsson, J. (1998) Curr. Opin. Lipidol. 9, 433–440.
21. Shimomura, I., Shimano, H., Korn, B.S., Bashmakov, Y. & Horton, J.D. (1998) J. Biol. Chem. 273, 35299–35306.
22. Wang, X., Sato, R., Brown, M.S., Hua, X. & Goldstein, J.L. (1994) Cell 77, 53–62.
23. Horton, J.D., Shimomura, I., Brown, M.S., Hammer, R.E., Goldstein, J.L. & Shimano, H. (1998) J. Clin. Invest. 101, 2331–2339.
24. Duncan, E.A., Brown, M.S., Goldstein, J.L. & Sakai, J. (1997) J. Biol. Chem. 272, 12778–12785.
25. Duncan, E.A., Davé, U.P., Sakai, J., Goldstein, J.L. & Brown, M.S. (1998) J. Biol. Chem. 273, 17801–17809.
26. Hua, X., Nohturfft, A., Goldstein, J.L. & Brown, M.S. (1996) Cell 87, 415–426.
27. Nohturfft, A., Brown, M.S. & Goldstein, J.L. (1998) J. Biol. Chem. 273, 17243–17250.
28. Neer, E.J., Schmidt, C.J., Nambudripad, R. & Smith, T.F. (1994) Nature (London) 371, 297–300.
29. Wall, M.A., Coleman, D.E., Lee, E., Iniguez-Lluhi, J.A., Posner, B.A., Gilman, A.G. & Sprang, S.R. (1995) Cell 83, 1047–1058.
30. Lambright, D.G., Sondek, J., Bohm, A., Skiba, N.P., Hamm, H.E. & Sigler, P.B. (1996) Nature (London) 379, 311–319.
31. Sakai, J., Nohturfft, A., Cheng, D., Ho, Y.K., Brown, M.S. & Goldstein, J.L. (1997) J. Biol. Chem. 272, 20213–20221.
32. Sakai, J., Nohturfft, A., Goldstein, J.L. & Brown, M.S. (1998) J. Biol. Chem. 273, 5785–5793.
33. Metherall, J.E., Ridgway, N.D., Dawson, P.A., Goldstein, J.L. & Brown, M.S. (1991) J. Biol. Chem. 266, 12734–12740.
34. Yang, J., Sato, R., Goldstein, J.L. & Brown, M.S. (1994) Genes Dev. 8, 1910–1919.
35. Yang, J., Brown, M.S., Ho, Y.K. & Goldstein, J.L. (1995) J. Biol. Chem. 270, 12152–12161.
36. Nohturfft, A., Hua, X., Brown, M.S. & Goldstein, J.L. (1996) Proc. Natl. Acad. Sci. USA 93, 13709–13714.
37. Nohturfft, A., Brown, M.S. & Goldstein, J.L. (1998) Proc. Natl. Acad. Sci. USA 95, 12848–12853.
38. Gil, G., Faust, J.R., Chin, D.J., Goldstein, J.L. & Brown, M.S. (1985) Cell 41, 249–258.
39. Kumagai, H., Chun, K.T. & Simoni, R.D. (1995) J. Biol. Chem. 270, 19107–19113.
40. Loftus, S.K., Morris, J.A., Carstea, E.D., Gu, J.Z., Cummings, C., Brown, A., Ellison, J., Ohno, K., Rosenfeld, M.A., Tagle, D.A., et al. (1997) Science 277, 232–235.
41. Tabin, C.J. & McMahon, A.P. (1997) Trends Cell Biol. 7, 442–446.
42. Porter, J.A., Young, K.E. & Beachy, P.A. (1996) Science 274, 255–259.
43. Hasan, M.T., Chang, C.C.Y. & Chang, T.Y. (1994) Somatic Cell Mol. Genet. 20, 183–194.
44. Rawson, R.B., Zelenski, N.G., Nijhawan, D., Ye, J., Sakai, J., Hasan, M.T., Chang, T.-Y., Brown, M.S. & Goldstein, J.L. (1997) Mol. Cell 1, 47–57.
45. Hooper, N.M. (1994) FEBS Lett. 354, 1–6.
46. Rawlings, N.D. & Barrett, A.J. (1995) Methods Enzymol. 248, 183–228.
47. Holmes, M.A. & Matthews, B.W. (1982) J. Mol. Biol. 160, 623–639.
48. Rawson, R.B., Cheng, D., Brown, M.S. & Goldstein, J.L. (1998) J. Biol. Chem. 273, 28261–28269.
49. Limanek, J.S., Chin, J. & Chang, T.Y. (1978) Proc. Natl. Acad. Sci. USA 75, 5452–5456.
50. DeKruijff, B., Gerritsen, W.J., Oerlemans, A., Demel, R.A. & Van Deenen, L.L.M. (1974) Biochim. Biophys. Acta 339, 30–43.
51. Sakai, J., Rawson, R.B., Espenshade, P.J., Cheng, D., Seegmiller, A.C., Goldstein, J.L. & Brown, M.S. (1998) Mol. Cell 2, 505–514.
52. Siezen, R.J. & Leunissen, J.A.M. (1997) Protein Sci. 6, 501–523.
53. Seidah, N.G. & Chretien, M. (1994) Methods Enzymol. 244, 175–188.
54. Nakayama, K. (1997) Biochem. J. 327, 625–635.
55. Sørensen, S.B., Bech, L.M., Meldal, M. & Breddam, K. (1993) Biochemistry 32, 8994–8999.

56. Nagase, T., Miyajima, N., Tanaka, A., Sazuka, T., Seki, N., Sato, S., Tabata, S., Ishikawa, K.-i., Kawarabayasi, Y., Kotani, H. & Nomura, N. (1995) DNA Res. 2, 37–43.
57. Seidah, N.G., Mowla, S.J., Hamelin, J., Mamarbachi, A.M., Benjannet, S., Touré, B.B., Basak, A., Munzer, J.S., Marcinkiewicz, J., Zhong, M., et al. (1999) Proc. Natl. Acad. Sci. USA 96, 1321–1326.
58. Leibrock, J., Lottspeich, F., Hohn, A., Hofer, M., Hengerer, B., Masiakowski, P., Thoenen, H. & Barde, Y.-A. (1989) Nature (London) 341, 149–152.
59. Rosenfeld, R.D., Zeni, L., Haniu, M., Talvenheimo, J., Radka, S.F., Bennett, L., Miller, J.A. & Welcher, A.A. (1995) Protein Expression Purif. 6, 465–471.
60. Kyte, J. & Doolittle, R.F. (1982) J. Mol. Biol. 157, 105–132.

Cellular mechanisms of β-amyloid production and secretion

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

SUKANTO SINHA* AND IVAN LIEBERBURG

Elan Pharmaceuticals, South San Francisco, CA 94080

ABSTRACT The major constituent of senile plaques in Alzheimer’s disease is a 42-aa peptide, referred to as β-amyloid (Aβ). Aβ is generated from a family of differentially spliced, type-1 transmembrane domain (TM)-containing proteins, called APP, by endoproteolytic processing. The major, relatively ubiquitous pathway of APP metabolism in cell culture involves cleavage by α-secretase, which cleaves within the Aβ sequence, thus precluding Aβ formation and deposition. An alternate secretory pathway, enriched in neurons and brain, leads to cleavage of APP at the N terminus of the Aβ peptide by β-secretase, thus generating a cell-associated β-C-terminal fragment (β-CTF). A pathogenic mutation at codons 670/671 in APP (APP “Swedish”) leads to enhanced cleavage at the β-secretase scissile bond and increased Aβ formation. An inhibitor of vacuolar ATPases, bafilomycin, selectively inhibits the action of β-secretase in cell culture, suggesting a requirement for an acidic intracellular compartment for effective β-secretase cleavage of APP. β-CTF is cleaved in the TM domain by γ-secretase(s), generating both Aβ 1–40 (90%) and Aβ 1–42 (10%). Pathogenic mutations in APP at codon 717 (APP “London”) lead to an increased proportion of Aβ 1–42 being produced and secreted. Missense mutations in PS-1, localized to chromosome 14, are pathogenic in the majority of familial Alzheimer’s pedigrees. These mutations also lead to increased production of Aβ 1–42 over Aβ 1–40. Knockout of PS-1 in transgenic animals leads to significant inhibition of production of both Aβ 1–40 and Aβ 1–42 in primary cultures, indicating that PS-1 expression is important for γ-secretase cleavages. Peptide aldehyde inhibitors that block Aβ production by inhibiting γ-secretase cleavage of β-CTF have been discovered.

Aβ Is Derived from APP. Alzheimer’s disease is a widespread, neurodegenerative, dementia-inducing disorder of the elderly that has been estimated to affect more than 4 million people in the United States alone. The disease is characterized by synaptic loss and neuronal death in the cerebral cortex and the hippocampus, with the presence of extensive extracellular amyloid plaques and intracellular neurofibrillary tangles (1). The pathology of Alzheimer’s disease has been studied extensively for the last 20 years, but it was not until about 15 years ago that the first molecular handle in understanding this complex degenerative disease was obtained, when the protein sequence of the extracellular amyloid was determined (2). The cloning of APP, achieved in 1987 (3), established that the fibrillar, ≈40-aa-long amyloid peptide deposited as the major constituent of both senile and cerebrovascular plaques is derived from a type-1 TM protein. The parsimonious hypothesis, immediately arising as a consequence of the schematic shown in Fig. 1, was that two separate endoproteolytic events released the smaller Aβ peptide from its precursor.

APP was also found to be expressed in a variety of tissues as a family of differentially spliced forms, the transcripts ranging in predicted size from 695 to 770 aa. The two longer forms, known as APP751 and APP770, contained a 56-aa domain with homology to the Kunitz family of serine protease inhibitors (4). APP695, the splicing variant lacking the Kunitz domain, was preferentially expressed in neuronal tissue, leading to the speculation that the production of Aβ from APP could be regulated by a protease that is inhibited by this domain.

The demonstration that a secreted, soluble form of APP was functionally identical to a previously isolated serine protease inhibitor called protease nexin II (5), together with the finding that the Kunitz domain showed restricted inhibitory activity toward a number of serine proteases (6), strengthened the hypothesis that the soluble ectodomain of APP functions as a circulating protease inhibitor.

Secreted APP (sAPP) Production: α-Secretase. Transfection of the various forms of APP into mammalian cells showed that newly synthesized APP, N-glycosylated in the endoplasmic reticulum, matures in the secretory pathway by the addition of O-glycosyl residues and tyrosine sulfation in the trans-Golgi network (7); cellular turnover of full-length, membrane-bound, mature APP is accompanied by the release in the conditioned medium (CM) of the soluble ectodomain of the protein and the appearance of a truncated cell-associated CTF (8). The soluble sAPP is detected not only in the CM of transfected cells but also in plasma and cerebrospinal fluid, suggesting a conserved metabolic pathway. Direct sequencing of the CTF obtained from APP-transfected cells showed that the endoproteolytic cleavage generating the sAPP and the corresponding CTF occurs primarily between amino acids 16 and 17 of the Aβ sequence (9), i.e., inside the Aβ sequence. Analysis of the metabolism of various site-specific mutants of APP led to the conclusion that the cleavage site of this unidentified cellular enzyme, named α-secretase, was relatively nonspecific, with distance from the TM being a more important parameter than the actual identity of amino acids at the cleavage site(s) (10). The ubiquity of this pathway, which by definition could not produce Aβ, led to the proposition that the “normal” cellular metabolism of APP precludes the formation of Aβ. The corollary, that the production of Aβ is caused by abnormal or “aberrant” cleavages in the full-length APP molecule, came to be accepted as well.

Further, it was recognized that α-secretase activity could be stimulated in cells by using phorbol esters, leading to the activation of protein kinase C (11). The demonstration that muscarinic agents mimic this effect (12) indicated that stimulated α-cleavage could be linked in neuronal cells to the activity of cholinergic agents. This demonstration lent more credence to the hypothesis that α-secretory processing of APP is a “good” pathway that is diminished in brain with Alzheimer’s disease, perhaps as a consequence of loss of cholinergic stimulation. The “uncleaved” APP could then be cleaved by aberrant proteolytic events, perhaps mediated by lysosomal enzymes, generating Aβ.

*To whom reprint requests should be addressed. E-mail: [email protected]. PNAS is available online at www.pnas.org.

Abbreviations: Aβ, β-amyloid; TM, transmembrane; CTF, C-terminal fragment; sAPP, secreted APP; Wt, wild type; CHO, Chinese hamster ovary; CM, conditioned medium.

FIG. 1. Aβ is generated from precursor protein, APP. N, N terminus; C, C terminus.

sAPP Production: β-Secretase. The first piece of evidence that Aβ production may not be aberrant after all was provided by the observation that both APP-transfected HEK293 cells (13) and fetal neuronal cultures (14) constitutively release Aβ 1–40 into the culture medium, i.e., Aβ generation and extracellular release are by-products of normal cellular metabolism of APP. This conclusion, dramatic at the time, has since been confirmed by many investigators and has come to be widely accepted.

Shortly thereafter, it was shown that a truncated form of sAPP was released from HEK293 cells transfected with APP, as well as from primary fetal human neuronal cultures (15). Using a neoepitope-specific antibody, these investigators showed that the truncated sAPP ended precisely at Met-596, a marker of specific endoproteolytic cleavage immediately N-terminal to the Aβ sequence. This β-sAPP made up a much larger proportion of total sAPP in the neuronal culture CM than in the HEK293 cell CM, suggesting that this alternative secretory cleavage, by the so-called β-secretase, was more prominent in cells derived from the central nervous system.

These two pivotal observations made it possible to measure three key metabolites of APP (α-sAPP, β-sAPP, and Aβ) in a cellular context and especially to look for both inhibitors and potential stimulators of Aβ release under defined conditions.

Stimulated Release of sAPP: Effect on Aβ. Phorbol esters, such as phorbol 12-myristate 13-acetate or phorbol dibutyrate, have been used widely to stimulate sAPP release in a variety of cellular systems. Early results suggested that stimulation of sAPP release was accompanied, reciprocally, by a decrease in Aβ release (16). However, subsequent analysis in a neuroblastoma cell line in culture showed that stimulated release of sAPP was not always accompanied by decreased Aβ (17). Although phorbol 12-myristate 13-acetate virtually universally stimulates α-sAPP production, there is little, if any, effect on β-sAPP levels, and the reduction of Aβ is often only transient (J. Knops and S.S., unpublished observations). No effect on synthesis of APP was seen in these experiments. Thus, there is not necessarily a mutually exclusive relationship between α- and β-secretory cleavages, a conclusion that has become more apparent as other pharmacological agents for affecting APP metabolism have become available.

Bafilomycin and β-sAPP Inhibition. A double mutation of codons 670/671 of APP, replacing the Lys-Met sequence with Asn-Leu (18) and segregating with very early-onset Alzheimer’s disease with classic pathologic hallmarks, was described in 1992. Transfection of HEK293 cells with cDNA constructs coding for the mutated protein led to a 6-fold increase in extracellularly released Aβ (19) compared with wild-type (Wt) APP. Concurrent analysis of the sAPP species released showed that there was also a substantial increase in the β-sAPP being released from such cells. The so-called Swedish mutation in APP thus seems to exert its pathogenic effect via an increased production of Aβ, mediated by increased β-secretase cleavage in the mutated protein. This observation provided not only a mechanistic explanation for a pathogenic mutation but also a cellular system, relevant to the underlying disease model, in which to study pharmacological agents that can selectively inhibit the formation of Aβ.
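The double substitution can be sketched on a short stretch of sequence around the β-secretase site. Only the Lys-Met to Asn-Leu change at codons 670/671 comes from the text; the flanking residues, the starting position (667, in APP770 numbering), and the helper name below are assumptions brought in for illustration.

```python
# Residues 667-676 of APP (APP770 numbering) around the beta-secretase site.
# ASSUMPTION: the flanking sequence is supplied for illustration; the text
# only specifies the Lys-Met -> Asn-Leu change at positions 670/671.
WILD_TYPE = "SEVKMDAEFR"
FIRST_POSITION = 667

# position -> (expected wild-type residue, replacement residue)
SWEDISH = {670: ("K", "N"), 671: ("M", "L")}


def apply_substitutions(sequence, first_position, substitutions):
    """Return a mutated copy, checking each expected wild-type residue first."""
    residues = list(sequence)
    for position, (expected, replacement) in substitutions.items():
        index = position - first_position
        if residues[index] != expected:
            raise ValueError(
                f"position {position}: expected {expected}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


swedish = apply_substitutions(WILD_TYPE, FIRST_POSITION, SWEDISH)

# The beta-secretase scissile bond lies just after residue 671.
cut = 671 - FIRST_POSITION + 1
print("wild type:", WILD_TYPE[:cut], "|", WILD_TYPE[cut:])
print("Swedish:  ", swedish[:cut], "|", swedish[cut:])
```

The two mutated residues sit immediately N-terminal to the scissile bond, which fits the observation that the mutation acts by enhancing β-secretase cleavage.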

A specific and potent inhibitor of vacuolar ATPases, bafilomycin, was shown to selectively inhibit the release of β-sAPP, but not α-sAPP, both from HEK293 cells transfected with APP Swedish mutants and from fetal neuronal cultures (20). This effect was ascribed to the known pharmacological activity of bafilomycin, treatment with which leads to the elevation of intravesicular pH in a variety of acidic organelles, including, but not restricted to, endosomes and lysosomes (21). The concordance of the data obtained from studies with both the mutant APP-transfected cells and fetal neuronal cultures metabolizing endogenous Wt APP showed (i) that selective inhibition of β-secretase cleavage results in inhibition of Aβ release and (ii) that α-sAPP release is not affected under these conditions. Further, these data provided indirect but convincing evidence that acidic intracellular conditions are most conducive to efficient β-secretase processing of APP.

Like APP, a number of other membrane-bound proteins are “shed” from the cell surface, often in response to stimulation by phorbol esters (22). A pathologically important protein in this regard is pro-tumor necrosis factor-α (proTNF-α), which undergoes cell-surface proteolysis by an “α-secretase-like” enzyme to release circulating TNF. The purification and identification of the TNF-α-converting enzyme (TACE) as a membrane-bound metalloprotease (23) led to speculation that, like TACE, APP α-secretase is also a member of the adamalysin protease family. Cells deficient in TACE do not show any defect in constitutive α-cleavage of APP (24); however, no stimulated release of sAPP is evident on treatment with phorbol esters, suggesting that TACE plays a key role in regulated, but not constitutive, α-cleavage of APP. Metalloprotease inhibitors directed toward such proteases inhibit α-sAPP release from Chinese hamster ovary (CHO) cells in a dose-dependent manner (25), but such treatments have no significant effect on either β-sAPP or Aβ (E. Goldbach, S. Suomensaari, J. Knops, and S.S., unpublished observations).

The results of the phorbol ester, bafilomycin, and metalloprotease inhibitor studies strongly suggest that a simple reciprocal relationship does not exist between α- and β-cleavage or between sAPP production and Aβ release. It seems most likely that α-secretase and β-secretase are cellularly segregated, mechanistically distinct enzymes, and it is the direct action of the latter that correlates most with Aβ release.

Pathogenic Mutations in APP. Three separate missense mutations in APP, occurring at codon 717 (London mutations), also cause early-onset Alzheimer’s (26) but do so by a mechanism very different from that of the Swedish mutation.

After β-secretase cleavage, the C terminus of the β-peptide has to be generated by a further proteolytic event, which takes place in the TM domain of APP. In keeping with the imaginative and sequential nomenclature for the enzymes postulated to be involved in cellular APP proteolysis and Aβ generation, the enzyme cleaving in the TM domain to generate the C terminus of the Aβ peptide has been named γ-secretase.
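The cleavages described so far can be made concrete with a short sketch that treats each secretase as a cut at a fixed position. It assumes the standard human Aβ 1–42 sequence, which is not listed in the text, and the function and variable names are illustrative only. The α-CTF-derived product printed here is the fragment referred to as p3 later in the text.

```python
# Human Abeta 1-42 in one-letter code.
# ASSUMPTION: this sequence is supplied for illustration; it is not given in the text.
ABETA_1_42 = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"

ALPHA_SITE = 16          # alpha-secretase cuts between residues 16 and 17
GAMMA_SITES = (40, 42)   # gamma-secretase C termini described in the text


def fragment(start, end):
    """Return residues start..end (1-based, inclusive) of Abeta 1-42."""
    return ABETA_1_42[start - 1:end]


def gamma_products(n_terminal_residue):
    """Peptides released by gamma-cleavage of a CTF that starts at the given residue."""
    return {end: fragment(n_terminal_residue, end) for end in GAMMA_SITES}


# beta-CTF starts at Abeta residue 1; alpha-CTF starts at residue 17.
abeta = gamma_products(1)
p3 = gamma_products(ALPHA_SITE + 1)

for end in GAMMA_SITES:
    print(f"Abeta 1-{end}: {abeta[end]}")
    print(f"p3 17-{end}:   {p3[end]}")
```

Because α-cleavage removes residues 1–16, no combination of α- and γ-cleavage in this model can yield full-length Aβ, which is the basis for the statement that the α-secretase pathway precludes Aβ formation.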

It has been shown that most of the Aβ released, both from cell lines derived from tissues other than the central nervous system and from neuronal cells, terminates at residue 40. However, a small proportion (5–10%) extends to residue 42 (27). It has been postulated that the major pathologic culprit in Alzheimer’s disease is this subpopulation of Aβ, because this longer, more aggregation-prone species deposits preferentially in both sporadic and familial Alzheimer’s disease brains. Careful measurement of the Aβ released from cells transfected with the various London mutations revealed that although total Aβ released was unaffected, the proportion of Aβ 1–42 increased by 50–90%, i.e., from about 10% of the total to about 20% (28). The London mutations thus shift the balance of γ-secretase cleavage slightly toward the 42 over the 40 cleavage site, which is sufficient, apparently, to cause disease. These observations have led to the proposition that there are at least two separate γ-secretases for the Aβ40 and Aβ42 sites. In the absence of definitive information, this subject lies at the heart of a current debate.
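The size of this shift is easier to see as arithmetic. The sketch below applies the quoted relative increases to an assumed 10% baseline and, as a further simplifying assumption, treats Aβ 1–40 and Aβ 1–42 as the only two species; the baseline value and the helper name are illustrative, not measured data.

```python
def shifted_fraction(baseline, relative_increase):
    """Fraction of total Abeta that is Abeta 1-42 after a relative increase."""
    return baseline * (1.0 + relative_increase)


# ASSUMPTION: wild-type share of Abeta 1-42, taken from "about 10%" in the text.
BASELINE_42 = 0.10

for increase in (0.5, 0.9):
    fraction_42 = shifted_fraction(BASELINE_42, increase)
    # Total Abeta is unchanged, so in this two-species model 1-40 makes up the rest.
    fraction_40 = 1.0 - fraction_42
    print(f"+{increase:.0%}: Abeta 1-42 = {fraction_42:.0%}, "
          f"42/40 ratio = {fraction_42 / fraction_40:.2f}")
```

A 50–90% relative increase takes a 10% share to roughly 15–19% of the total, in line with the "about 20%" quoted above.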

Both the London and the Swedish mutations have been used to develop transgenic models of the pathology seen in Alzheimer’sdisease. The so-called PDGF promoter APP mouse was developed with the Val717Phe mutation (29). As the animals age, Aβ 1–42deposits preferentially in the hippocampus and the cortex, mirroring the pattern seen in Alzheimer’s disease. Like Alzheimer’s disease,no senile plaques are seen in the cerebellum, in spite of expression of the transgene in this region. In addition to plaques, one mayobserve neuritic dystrophy, microglial activation, and astrocytic activation (30), following closely on the heels of the amyloiddeposition. The major hallmarks of the disease are thus preserved in these models, which will be invaluable in evaluating the efficacyof compounds targeting the production or aggregation of the Aβ peptide.

Presenilins and Alzheimer’s Disease. APP mutations, as illuminating as they have been in both the causative role of Aβ inAlzheimer’s disease and in underscoring the importance of both β- and γ-secretase-mediated cleavages for Aβ generation and release,are relatively rare and confined to only a few familial pedigrees. A much larger number of familial Alzheimer’s disease pedigreescluster to chromosome 14, and the product of this gene, S182, was revealed to be a multiple-membrane-spanning protein (31)imaginatively called presenilin-1. At least 37 separate missense mutations have been documented in this protein. A related gene,STML2, on chromosome 1, the protein product of which is called presenilin-2 (32), has also been shown to have missense mutationsthat cause Alzheimer’s disease, and two of these mutations have been documented thus far. The pathology seen in the brains of thepedigrees examined invariably show dramatic deposition of amyloid, virtually all of which are in the 1–42 form (33). Disease causedby the PS-1 mutations is aggressive, early-onset, and fully penetrant.

Cotransfection of presenilin mutants along with APP revealed the same phenomenon seen with the London mutations, i.e., thepresenilin mutants invariably increase the proportion of x–42 forms between 50–100% over that seen with Wt presenilins (34). Nosignificant effects on sAPP release or on the levels of total Aβ released are seen in these experiments. Cotransfection of APP carryingone of the London mutations along with a mutant PS-1 leads to an additive effect on the increased Aβ40/42 ratio.

Thus, the majority of familial Alzheimer’s mutants cluster to a gene, the protein product of which somehow modulates the γ-secretase cleavage with the same consequences resulting from London mutations. The homology of the PS-1 to sel-12, aCaenorhabditis elegans gene that facilitates signaling by Notch (35), has led to speculation about cellular mechanisms that mightunderlie the increased γ-secretase cleavage at residue 42.

The most telling data have emerged from an attempt to create PS-1 –/– animals. The homozygous animals die in utero with severedevelopmental abnormalities reminiscent of Notch –/– animals. However, the introduction, via viral vectors, of Wt and mutant APPsinto cortical cultures produced from these embryos (36) showed that, although normal APP maturation and sAPP release wereunaffected, the cells were deficient in γ-secretase cleavage of the α- and β-CTFs generated by the action of α- and β-secretases; both Aβand p3 (the α-CTF-derived γ-secretase cleavage product) ending at residue 40 or 42 decreased by 80%, with a corresponding increasein the ambient levels of the corresponding CTFs. These results strongly suggest that the expression of PS-1 is needed for the majorityof functional γ-secretase activity in vivo. Perhaps the residual production of Aβ and p3 is mediated by PS-2.

Peptide Aldehyde Inhibitors of Aβ Release. It has been known for some time that, in cell lines derived from peripheral tissues, such as HEK293, much of the full-length mature APP is degraded via a lysosomal pathway. The application of lysosomotropic agents, such as chloroquine and NH4Cl, or cysteine protease inhibitors, such as E-64 and leupeptin, led to enhanced recovery of full-length membrane-bound APP and the visualization of degradation intermediates (37). However, neither E-64 nor leupeptin has any effect on the release of Aβ under such conditions, indicating that the so-called “endosomal-lysosomal” degradation pathway is probably not involved in the generation of Aβ. In contrast, Z-Val-Phe-CHO, a dipeptide aldehyde originally identified as a potent inhibitor of a number of intracellular cysteine proteases, such as cathepsin B, cathepsin L, and calpain (38), was shown to inhibit Aβ release at low micromolar levels in a dose-dependent manner (39). A number of other dipeptide aldehydes, with ED50 values varying between 1 and 25 µM, were also shown to be active as inhibitors of cellular Aβ release in HEK293 cells transfected with either Wt or Swedish APP.

Analysis of the cellular pattern of metabolites indicated that the release of both p3 and Aβ was being inhibited by such compounds, with concomitant increases in the levels of the corresponding CTFs. The mechanism of the action of such compounds is therefore via inhibition of γ-secretase cleavage, either as direct inhibitors of the enzyme or through indirect effects on events critical to γ-secretase cleavage. As shown in Table 1, some closely related compounds in this series have differential effects on their relative potency toward Aβ x–40 vs. Aβ x–42 inhibition in HEK293 cells stably transfected with the APP Swedish mutants. These effects have led some investigators to propose that different γ-secretases are involved in the two cleavages.

However, it has been suggested that Aβ 1–40 is produced closer to the cell surface than is Aβ 1–42 (40); if this suggestion is accurate, variations in compound levels among intracellular compartments may explain the differential inhibitory susceptibilities seen with some of these compounds.

Table 1. Effect of dipeptide aldehydes on cellular Aβ release

                              ED50, µM
Compound                  Aβ x–40    Aβ x–42
Z-Val-Phe-CHO             15.5       67.4
2-Naphthyl-Val-Phe-CHO    2.6        2.7
Z-Phe-Val-CHO             Not inhibitory
Z-Leu-Phe-CHO             5.0        –

Although the peptide aldehydes seem to point to the role of an intracellular cysteine or serine protease as pivotal to γ-secretase processing of CTFs, direct evidence for such an enzyme target for these compounds is still lacking. In this regard, a recent publication (41) has put forward a quite remarkable proposition as to the possible nature of γ-secretase. In this report, the mutation of either of two separate TM aspartic acid residues in PS-1, Asp-257 in TM6 and Asp-385 in TM7, leads to a lowering of Aβ and increases the amounts of the α- and β-CTFs, as seen in the PS-1 –/– mice-derived neuronal cultures. The authors suggest that PS-1 may be γ-secretase, with the two aspartic acid residues forming a catalytic system analogous to that conserved in the aspartic proteinase family. It should be noted that no discernible amino acid sequence homology exists between PS-1 and any aspartic proteinase, even around the putative “active-site” Asp-257 and Asp-385 residues, and more direct evidence is needed in support of this concept.

β-Secretase: Rate-Limiting Enzyme for Aβ Production. The London mutations in APP and the missense mutations in PS-1 that lead to Alzheimer’s disease have in common their alteration of the relative cleavage at the –40 and –42 sites in the TM domain of APP. The specificity of these γ-secretase cleavages was analyzed further by sequentially replacing amino acids 35–48 in the TM domain with Phe (42), akin to the “Ala scan” used in other scanning mutagenesis approaches. The production of Aβ and the relative ratios of x–40 vs. x–42 forms were then analyzed in the CM of cells transfected with these mutant forms. Although position 45 was identified as being critical for –42 cleavage, there was little specificity at the γ-cleavage sites; although there were alterations in the relative ratios, total Aβ formation was relatively unaffected by the scanning mutagenesis, suggesting that the precise identity of the amino acid residues at or near the γ-cleavage sites was not critical to total cleavage.

In sharp contrast, site-directed mutagenesis at the Met-Asp cleavage site on the β-end leads to dramatic effects on Aβ production (43). Although the substitution of Leu for Met at the P1 position (akin to the Swedish mutation) leads to enhancement of Aβ formation, substitution at this site by most other amino acids leads to a suppression of Aβ release in the extracellular medium, presumably by inhibition of β-secretase cleavage. Effective β-secretase cleavage is thus a prerequisite for formation and secretion of Aβ. In the case of some of the mutants, the fact that shorter Aβ peptides are secreted at a lower rate may represent the effect of an alternate cleavage site exposed as a result of conformational change in the mutated protein.

In conjunction with the results obtained with bafilomycin, it seems that β-secretase cleavage is a rate-limiting event for the formation of the “substrate” for γ-secretase. The latter enzymatic process is quite capable of turning over even the 5- to 6-fold excess β-CTFs generated in APP Swedish-transfected HEK293 cells, over that produced with Wt alone. Further, the Swedish mutation, unfortunately for the pedigree, causes disease by presenting a preferred β-cleavage site to the cellular enzyme.

β-Secretase: Isolation and Characterization. The search for enzymes that specifically cleave at the β-cleavage site in APP was initiated long before there was any cellular evidence for the presence of such a metabolic pathway. Although enzymes such as the metalloendopeptidase (EC 3.4.24.15) and cathepsin D were proposed to be candidate β-secretases, primarily as a result of cleavage specificity shown by using short peptide substrates (44), neither enzyme has passed the tests of being able to cleave full-length APP specifically, generating both the N- and C-terminal fragments. Cotransfection of these enzymes along with APP into cells such as HEK293 did not lead to the overproduction of either Aβ or β-sAPP (45).

The existence of the β-secretase pathway of APP cleavage, enriched in neuronal cells, leads to specific cleavage of APP at the N terminus of the Aβ peptide sequence. This cleavage leads to the formation of the soluble β-sAPP, as well as the membrane-associated β-CTF, the immediate precursor to Aβ. The compilation of the cellular results obtained by studying APP processing thus suggests that a true candidate β-secretase should have, at a minimum, the following characteristics. (i) It should specifically cleave APP at the Met-Asp site to generate the corresponding β-sAPP and β-CTF fragments. (ii) It should show preferential cleavage toward Swedish over Wt sequence at the cleavage site. (iii) It should function optimally at an acidic pH. (iv) It should be enriched in brain and neuronal tissue but present in cell lines such as HEK293 as well. The isolation and enzymatic characterization of a membrane-bound protease from human brain that meets these criteria (46) has been made possible by using APP-based fusion proteins incorporating both Wt and Swedish sequences, as well as the development of very specific ELISA-based quantitative assays for measuring cleavage at the β-cleavage site(s) in these fusion proteins. Although the identity of this enzymatic activity is not yet published, recombinant expression and cotransfection with APP would establish whether such an enzyme fulfills the additional cellular criteria of showing enhanced, specific cleavage in APP proteins at the β-cleavage sites.

1. Selkoe, D.J. (1991) Neuron 6, 487–498.
2. Glenner, G.G. & Wong, C.W. (1984) Biochem. Biophys. Res. Commun. 3, 885–890.
3. Kang, J., Lemaire, H.G., Unterbeck, A., Salbaum, J.M., Masters, C.L., Grzeschik, K.H., Multhaup, G., Beyreuther, K. & Muller-Hill, B. (1987) Nature (London) 325, 733–736.
4. Ponte, P., Gonzalez-DeWhitt, P., Schilling, J., Miller, J., Hsu, D., Greenberg, B., Davis, K., Wallace, W., Lieberburg, I. & Fuller, F. (1988) Nature (London) 331, 525–527.
5. Oltersdorf, T., Fritz, L.C., Schenk, D.B., Lieberburg, I., Johnson-Wood, K.L., Beattie, E.C., Ward, P.J., Blacher, R.W., Dovey, H.F. & Sinha, S. (1989) Nature (London) 341, 144–147.
6. Sinha, S., Dovey, H.F., Seubert, P., Ward, P.J., Blacher, R.W., Blaber, M., Bradshaw, R.A., Arici, M., Mobley, W.C. & Lieberburg, I. (1990) J. Biol. Chem. 265, 8983–8985.
7. Weidemann, A., Konig, G., Bunke, D., Fischer, P., Salbaum, J.M., Masters, C.L. & Beyreuther, K. (1989) Cell 57, 115–126.
8. Oltersdorf, T., Ward, P.J., Henriksson, T., Beattie, E.C., Neve, R., Lieberburg, I. & Fritz, L.C. (1990) J. Biol. Chem. 265, 4492–4497.
9. Esch, F.S., Keim, P.S., Beattie, E.C., Blacher, R.W., Culwell, A.R., Oltersdorf, T., McClure, D. & Ward, P.J. (1990) Science 248, 1122–1124.
10. Sisodia, S.S. (1992) Proc. Natl. Acad. Sci. USA 89, 6075–6079.
11. Buxbaum, J.D., Gandy, S.E., Cicchetti, P., Ehrlich, M.E., Czernik, A.J., Fracasso, R.P., Ramabhadran, T.V., Unterbeck, A.J. & Greengard, P. (1990) Proc. Natl. Acad. Sci. USA 87, 6003–6006.
12. Nitsch, R.M., Slack, B.E., Wurtman, R.J. & Growdon, J.H. (1992) Science 258, 304–307.
13. Seubert, P., Oltersdorf, T., Lee, M.G., Barbour, R., Blomquist, C., Davis, D.L., Bryant, K., Fritz, L.C., Galasko, D., Thal, L.J., et al. (1993) Nature (London) 361, 260–263.
14. Haass, C., Schlossmacher, M.G., Hung, A.Y., Vigo-Pelfrey, C., Mellon, A., Ostaszewski, B.L., Lieberburg, I., Koo, E.H., Schenk, D., Teplow, D.B., et al. (1992) Nature (London) 359, 322–325.
15. Seubert, P., Vigo-Pelfrey, C., Esch, F., Lee, M., Dovey, H., Davis, D., Sinha, S., Schlossmacher, M., Whaley, J., Swindlehurst, C., et al. (1992) Nature (London) 359, 325–327.
16. Buxbaum, J.D., Koo, E.H. & Greengard, P. (1993) Proc. Natl. Acad. Sci. USA 90, 9195–9180.
17. Dyrks, T., Monning, U., Beyreuther, K. & Turner, J. (1994) FEBS Lett. 349, 210–214.
18. Mullan, M., Crawford, F., Axelman, K., Houlden, H., Lilius, L., Winblad, B. & Lannfelt, L. (1992) Nat. Genet. 1, 345–347.
19. Citron, M., Oltersdorf, T., Haass, C., McConlogue, L., Hung, A.Y., Seubert, P., Vigo-Pelfrey, C., Lieberburg, I. & Selkoe, D.J. (1992) Nature (London) 360, 672–674.
20. Knops, J., Suomensaari, S., Lee, M., McConlogue, L., Seubert, P. & Sinha, S. (1995) J. Biol. Chem. 270, 2419–2422.
21. Yoshimori, T., Yamamoto, A., Moriyama, Y., Futai, M. & Tashiro, Y. (1991) J. Biol. Chem. 266, 17707–17712.
22. Hooper, N.M., Karran, E.H. & Turner, A.J. (1997) Biochem. J. 321, 265–279.
23. Black, R.A., Rauch, C.T., Kozlosky, C.J., Peschon, J.J., Slack, J.L., Wolfson, M.F., Castner, B.J., Stocking, K.L., Reddy, P., Srinivasan, S., et al. (1997) Nature (London) 385, 729–733.
24. Buxbaum, J.D., Liu, K.N., Luo, Y., Slack, J.L., Stocking, K.L., Peschon, J.J., Johnson, R.S., Castner, B.J., Cerretti, D.P. & Black, R.A. (1998) J. Biol. Chem. 273, 27765–27767.
25. Arribas, J., Coodly, L., Vollmer, P., Kishimoto, T.K., Rose-John, S. & Massague, J. (1996) J. Biol. Chem. 271, 11376–11382.
26. Goate, A.M. (1998) Cell Mol. Life Sci. 54, 897–901.

27. Dovey, H.F., Suomensaari-Chrysler, S., Lieberburg, I., Sinha, S. & Keim, P.S. (1993) NeuroReport 4, 1039–1042.
28. Suzuki, N., Cheung, T.T., Cai, X.D., Odaka, A., Otvos, L., Jr., Eckman, C., Golde, T.E. & Younkin, S.G. (1994) Science 264, 1336–1340.
29. Games, D., Adams, D., Alessandrini, R., Barbour, R., Berthelette, P., Blackwell, C., Carr, T., Clemens, J., Donaldson, T., Gillespie, F., et al. (1995) Nature (London) 373, 523–527.
30. Chen, K.S., Masliah, E., Grajeda, H., Guido, T., Huang, J., Khan, K., Motter, R., Soriano, F. & Games, D. (1998) Prog. Brain Res. 117, 327–334.
31. Sherrington, R., Rogaev, E.I., Liang, Y., Rogaeva, E.A., Levesque, G., Ikeda, M., Chi, H., Lin, C., Li, G., Holman, K., et al. (1995) Nature (London) 375, 754–760.
32. Levy-Lahad, E., Wasco, W., Poorkaj, P., Romano, D.M., Oshima, J., Pettingell, W.H., Yu, C.E., Jondro, P.D., Schmidt, S.D., Wang, K., et al. (1995) Science 269, 973–977.
33. Lemere, C.A., Lopera, F., Kosik, K.S., Lendon, C.L., Ossa, J., Saido, T.C., Yamaguchi, H., Ruiz, A., Martinez, A., Madrigal, L., et al. (1996) Nat. Med. 2, 1146–1150.
34. Citron, M., Westaway, D., Xia, W., Carlson, G., Diehl, T., Levesque, G., Johnson-Wood, K., Lee, M., Seubert, P., Davis, A., et al. (1997) Nat. Med. 3, 67–72.
35. Levitan, D. & Greenwald, I. (1995) Nature (London) 377, 351–354.
36. De Strooper, B., Saftig, P., Craessaerts, K., Vanderstichele, H., Guhde, G., Annaert, W., Von Figura, K. & Van Leuven, F. (1998) Nature (London) 391, 387–390.
37. Knops, J., Lieberburg, I. & Sinha, S. (1992) J. Biol. Chem. 267, 16022–16024.
38. Mehdi, S., Angelastro, M.R., Wiseman, J.S. & Bey, P. (1988) Biochem. Biophys. Res. Commun. 157, 1117–1123.
39. Higaki, J., Quon, D., Zhong, Z. & Cordell, B. (1995) Neuron 14, 651–659.
40. Hartmann, T., Bieger, S.C., Bruhl, B., Tienari, P.J., Ida, N., Allsop, D., Roberts, G.W., Masters, C.L., Dotti, C.G., Unsicker, K., et al. (1997) Nat. Med. 3, 1016–1020.
41. Wolfe, M.S., Xia, W., Ostaszewski, B.L., Diehl, T.S., Kimberley, W.T. & Selkoe, D.J. (1999) Nature (London) 398, 513–517.
42. Lichtenthaler, S.F., Wang, R., Grimm, H., Uljon, S., Masters, C.L. & Beyreuther, K. (1999) Proc. Natl. Acad. Sci. USA 96, 3053–3058.
43. Citron, M., Teplow, D.B. & Selkoe, D.J. (1995) Neuron 14, 661–670.
44. Brown, A.M., Tummolo, D.M., Spruyt, M.A., Jacobsen, J.S. & Sonnenberg-Reines, J. (1996) J. Neurochem. 66, 2436–2445.
45. Thompson, A., Grueninger-Leitch, F., Huber, G. & Malherbe, P. (1997) Brain Res. Mol. Brain Res. 48, 206–214.
46. Sinha, S., Suomensaari, S., Keim, P., Jacobson-Croak, K., Zhao, J., Hu, K., Tan, H., Tatsuno, G., McConlogue, L., Lieberburg, I., et al. (1997) Soc. Neurosci. Abstr. 23, 4.

Reverse biochemistry: Use of macromolecular protease inhibitors to dissect complex biological processes and identify a membrane-type serine protease in epithelial cancer and normal tissue

This paper was presented at the National Academy of Sciences colloquium “Proteolytic Processing and Physiological Regulation,” held February 20–21, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA.

TOSHIHIKO TAKEUCHI*, MARC A. SHUMAN†, AND CHARLES S. CRAIK*‡

*Departments of Pharmaceutical Chemistry and Biochemistry & Biophysics, and †Department of Medicine, University of California, San Francisco, CA 94143

ABSTRACT Serine proteases of the chymotrypsin fold are of great interest because they provide detailed understanding of their enzymatic properties and their proposed role in a number of physiological and pathological processes. We have been developing the macromolecular inhibitor ecotin to be a “fold-specific” inhibitor that is selective for members of the chymotrypsin-fold class of proteases. Inhibition of protease activity through the use of wild-type and engineered ecotins results in inhibition of rat prostate differentiation and retardation of the growth of human PC-3 prostatic cancer tumors. In an effort to identify the proteases that may be involved in these processes, reverse transcription-PCR with PC-3 poly(A)+ mRNA was performed by using degenerate oligonucleotide primers. These primers were designed by using conserved protein sequences unique to chymotrypsin-fold serine proteases. Five proteases were identified: urokinase-type plasminogen activator, factor XII, protein C, trypsinogen IV, and a protease that we refer to as membrane-type serine protease 1 (MT-SP1). The cloning and characterization of the MT-SP1 cDNA shows that it encodes a mosaic protein that contains a transmembrane signal anchor, two CUB domains, four LDLR repeats, and a serine protease domain. Northern blotting shows broad expression of MT-SP1 in a variety of epithelial tissues with high levels of expression in the human gastrointestinal tract and the prostate. A His-tagged fusion of the MT-SP1 protease domain was expressed in Escherichia coli, purified, and autoactivated. Ecotin and variant ecotins are subnanomolar inhibitors of the MT-SP1 activated protease domain, suggesting a possible role for MT-SP1 in prostate differentiation and the growth of prostatic carcinomas.

Serine proteases possessing a chymotrypsin fold are of great interest because they provide detailed understanding of their enzymatic properties and their proposed role in a number of physiological and pathological processes. A wealth of information exists on structure-function relationships regarding this large class of enzymes. Moreover, potent and specific inhibitors are readily available for use in dissecting the function of these enzymes. These proteases exist as precursors that are activated by specific and limited proteolysis, allowing regulation of enzyme activity (1). Examples of this type of regulation include blood coagulation (2), fibrinolysis (3), complement activation (4), and trypsinogen activation by enteropeptidase in digestion (5). The precise control of these activation processes is crucial for normal physiological enzymatic function; misregulation of these enzymes can lead to pathological conditions (2–5).

We are interested in studying the role of these chymotrypsin-fold serine proteases in cancer by using a “fold-specific” inhibitor, ecotin (6, 7). Ecotin or engineered versions of ecotin can be introduced into complex biological systems as probes of proteolysis by these chymotrypsin-fold proteases. If effects are observed on treatment with these unique inhibitors, then the large body of knowledge concerning the biochemistry of these proteases can be tapped to understand the structure and function of the target proteases. For example, the molecular cloning, structural modeling, and mechanistic understanding of the enzymes are immediately accessible. We refer to this approach, which is analogous to “reverse genetics,” as “reverse biochemistry,” and we have applied it to identification of specific serine proteases in prostate cancer.

Urokinase-type plasminogen activator (uPA) has been implicated in tumor-cell invasion and metastasis. Cancer-cell invasion into normal tissue can be facilitated by uPA through its activation of plasminogen to plasmin, which degrades the basement membrane and extracellular matrix (reviewed in refs. 8 and 9). The role of other serine proteases in cancer has been less well characterized.

One useful model system for studying many issues that are pertinent to prostate cancer is the development of the rodent ventral prostate in explant cultures. Macromolecular inhibitors of serine proteases of the chymotrypsin fold, ecotin and ecotin M84R/M85R (6, 7), inhibit ductal branching morphogenesis and differentiation of the explanted rat ventral prostate (F. Elfman, T.T., C.C., G. Cunha, and M.S., unpublished data). Ecotin M84R/M85R is a 2,800-fold more potent inhibitor of uPA than ecotin (1 nM vs. 2.8 µM) (6). However, inhibition of prostate differentiation was seen with both inhibitors, suggesting that uPA and other related serine proteases are involved in the differentiation and continued growth of the rat ventral prostate. Thus, unidentified serine proteases may play a role in growth and prevention of apoptosis in prostate epithelial cells in this system.

Another well characterized model that is derived from human prostate cancer epithelial cells is the PC-3 cell line (10). The PC-3 cell line expresses uPA as assayed by ELISA and by Northern blotting of PC-3 mRNA (11). We found that the primary tumor size in PC-3-implanted nude mice was significantly smaller in mice treated for 7 weeks with either ecotin M84R/M85R or wild-type ecotin than in PBS-treated mice. Metastasis from the primary tumors was similarly lower in the inhibitor-treated mice than in PBS-treated mice (O. Melnyk, T.T., C.C., and M.S., unpublished data). Inhibition was not unexpected with ecotin M84R/M85R treatment, because uPA has been implicated in metastasis. However, wild-type ecotin is a poor, micromolar inhibitor of uPA; one interpretation of the data is that the decrease in tumor size and metastasis in the mouse model involves the inhibition of additional serine proteases. Thus, identification of the serine proteases expressed by PC-3 prostate cells may provide insight into the role of these proteases in cancer and prostate growth and development. In this report we have extended the strategy of using PCR with degenerate oligonucleotide primers that were designed by using conserved sequence homology (12–14) to identify additional serine proteases made by cancer cells. Five independent serine protease cDNAs derived from PC-3 mRNA were sequenced, including a novel serine protease, which we refer to as membrane-type serine protease 1 (MT-SP1). The cloning and characterization of this cDNA, which encodes a mosaic, transmembrane protease, is reported.

‡To whom reprint requests should be addressed. E-mail: [email protected].
PNAS is available online at www.pnas.org.
Abbreviations: MT-SP1, membrane-type serine protease 1; CUB, complement factor 1R-urchin embryonic growth factor-bone morphogenetic protein; LDLR, low density lipoprotein receptor; uPA, urokinase-type plasminogen activator; pNA, p-nitroanilide.
Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. Banklt257050 and AF133086).

MATERIALS AND METHODS

Materials. All primers used were synthesized on an Applied Biosystems 391 DNA synthesizer. All restriction enzymes were purchased from New England Biolabs. Automated DNA sequencing was carried out on an Applied Biosystems 377 Prism sequencer, and manual DNA sequencing was carried out under standard conditions. N-terminal amino acid sequencing was performed on an ABI 477A by the University of California, San Francisco Biomolecular Resource Center. The synthetic substrates Suc-AAPX-p-nitroanilide (pNA) [N-succinylalanyl-alanyl-prolyl-Xxx-pNA (Xxx = alanyl, aspartyl, glutamyl, phenylalanyl, leucinyl, methionyl, or arginyl)] and H-Arg-pNA (arginyl-pNA) were purchased from Bachem. Deglycosylation was performed by using PNGase F (NEB, Beverly, MA). All other reagents were of the highest quality available and purchased from Sigma or Fisher unless otherwise noted.

Isolation of cDNA from PC-3 Cells. mRNA was isolated from PC-3 cells by using the polyATtract System 1000 kit (Promega). Reverse transcription was primed by using the “lock-docking” oligo(dT) primer (15). Superscript II reverse transcriptase (Life Technologies, Grand Island, NY) was used in accordance with the manufacturer’s instructions to synthesize the cDNA from the PC-3 mRNA.

Amplification of MT-SP1 Gene. The degenerate primers used for amplifying the protease domains were designed from the consensus sequences flanking the catalytic histidine (5′ His-primer) and the catalytic serine (3′ Ser-primer), similar to those described (12). The 5′ primer used is as follows: 5′-TGG (AG)TI (CAG)TI (AT)(GC)I GCI (GA)CI CA(CT) TG-3′, where nucleotides in parentheses represent equimolar mixtures and I represents deoxyinosine. This primer encodes at least the following amino acid sequence: W (I/V) (I/V/L/M) (S/T) A (A/T) H C. The 3′ primer used is as follows: 5′-IGG ICC ICC I(GC)(AT) (AG)TC ICC (CT)TI (GA)CA IG(ATC) (GA)TC-3′. The reverse complement of the 3′ primer encodes at least the following amino acid sequence: D (A/S/T) C (K/E/Q/H) G D S G G P.
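The primer notation above can be checked mechanically. The following is a minimal sketch, not the authors' procedure: it parses the 5′ His-primer as written in the text, counts the synthesized variants implied by the equimolar mixtures, and lists the residues each complete codon could encode. It assumes that deoxyinosine may pair with any of the four bases and that the standard genetic code applies; the function names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# 5' His-primer exactly as quoted in the text (5' to 3').
HIS_PRIMER = "TGG (AG)TI (CAG)TI (AT)(GC)I GCI (GA)CI CA(CT) TG"


def parse_primer(text):
    """Return one set of possible bases per primer position."""
    positions = []
    for group, single in re.findall(r"\(([ACGT]+)\)|([ACGTI])", text.replace(" ", "")):
        if group:
            positions.append(set(group))      # equimolar mixture
        elif single == "I":
            positions.append(set("ACGT"))     # assumption: inosine pairs with any base
        else:
            positions.append({single})
    return positions


def synthesized_degeneracy(text):
    """Number of distinct oligonucleotides implied by the mixtures (inosine excluded)."""
    count = 1
    for group in re.findall(r"\(([ACGT]+)\)", text):
        count *= len(group)
    return count


def encoded_residues(positions):
    """Set of amino acids encodable by each complete codon, reading from position 1."""
    full = len(positions) - len(positions) % 3
    return [
        {CODON_TABLE["".join(codon)] for codon in product(*positions[i:i + 3])}
        for i in range(0, full, 3)
    ]


positions = parse_primer(HIS_PRIMER)
residues = encoded_residues(positions)
for index, options in enumerate(residues, start=1):
    print(index, "/".join(sorted(options)))
print("synthesized variants:", synthesized_degeneracy(HIS_PRIMER))
```

Under these assumptions the primer is 23 nucleotides long with 96 synthesized variants, and each codon's residue set contains the residues listed in the text. Because inosine is treated as fully ambiguous, some positions admit additional residues, which is consistent with the text's wording "at least".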

Direct amplification of serine protease cDNA was not possible by using the above primers. Instead, the first PCR was performed with the 5′ His-primer and the oligo(dT) primer described above, by using the “touchdown” PCR protocol (16), with annealing temperatures decreasing from 52°C to 42°C over 22 rounds and 13 final rounds at 54°C annealing temperature. Cycle times were 1 min (denaturing), 1 min (annealing), and 2 min (extension) and were followed by one final extension time of 15 min after the final round of PCR. The template for the second PCR was 0.5 µl (total reaction volume 50 µl) of a 1:10 dilution of the first PCR mixture that was performed with the 5′ His-primer and the oligo(dT). The second PCR was primed with the 5′ His- and the 3′ Ser-primers and performed by using the touchdown protocol described above. All PCRs used 12.5 pmol of primer for 50-µl reaction volume.
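The cycling parameters quoted above can be laid out as an explicit schedule. This is a rough sketch only: the text gives the start and end annealing temperatures and the number of rounds, but not the step size, so an even (linear) decrease per round is assumed here. Ramp times and any initial denaturation are ignored, and all names are hypothetical.

```python
def touchdown_schedule(start_c=52.0, end_c=42.0, ramp_rounds=22,
                       final_rounds=13, final_c=54.0):
    """Annealing temperature for each round; linear ramp is an assumption."""
    ramp = [start_c + (end_c - start_c) * i / (ramp_rounds - 1)
            for i in range(ramp_rounds)]
    return ramp + [final_c] * final_rounds


def programmed_minutes(rounds, denature=1, anneal=1, extend=2, final_extension=15):
    """Total programmed hold time, excluding temperature ramping."""
    return rounds * (denature + anneal + extend) + final_extension


def primer_micromolar(pmol=12.5, volume_ul=50.0):
    """pmol per microliter is numerically equal to micromolar."""
    return pmol / volume_ul


schedule = touchdown_schedule()
total_minutes = programmed_minutes(len(schedule))
primer_um = primer_micromolar()
print(len(schedule), "rounds,", total_minutes, "min,", primer_um, "uM primer")
```

With the figures in the text this corresponds to 35 rounds in total and a primer concentration of 0.25 µM in each 50-µl reaction.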

The product of the second reaction was purified on a 2% agarose gel, and all products between 400 and 550 bp were cut from the gel and extracted by using the QIAquick gel extraction kit (Qiagen, Chatsworth, CA). These products were digested with the BamHI restriction enzyme to cut any uPA cDNA, and all 400- to 500-bp fragments were repurified on a 2% agarose gel. These reaction products were subjected to a third PCR by using the 5′ His-primer and the 3′ Ser-primer with the identical touchdown procedure. These reaction products were gel-purified and directly cloned into the pPCR2.1 vector by using the TOPO TA ligation kit (Invitrogen). DNA sequencing of the inserts determined the cDNA sequence from nucleotides 1,984 to 2,460 (see Fig. 1).

Northern Blot Analysis. 32P-labeled nucleotides were purchased from Amersham Pharmacia. A cDNA fragment containing nucleotides 1,173–2,510 was digested from expressed sequence tag w39209 by using restriction enzymes EcoRI and BsmBI, yielding a 1.3-kilobase nucleotide insert. Labeled cDNA probes were synthesized by using the Rediprime random primer labeling kit (Amersham Pharmacia) and 20 ng of the purified insert. Poly(A)+ RNA membranes for Northern blotting were purchased from Origene (Rockville, MD; HB-1002, HB-1018) and CLONTECH (Human II 7759–1, Human Cancer Cell Line 7757). The blots were performed under stringent annealing conditions as described in ref. 17.

Construction of Expression Vectors. The mature protease domain and a small portion of the pro-domain (nucleotides 1,822–2,601) cDNA were amplified by using PCR from expressed sequence tag w39209 and ligated into the pQE30 vector (Qiagen). This construct is designed to overexpress the protease sequence from amino acids (aa) 596–855 with the following fusion: Met-Arg-Gly-Ser-His6-aa596–855. The His-tag fusion allows affinity purification by using metal-chelate chromatography. The change from Ser-805, encoded by TCC, to Ala (GCT) was performed by using PCR. The presence of the correct Ser → Ala substitution in the pQE30 vector was verified by DNA sequence analysis.
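The nucleotide and residue coordinates quoted in the Methods can be cross-checked with simple arithmetic. The sketch below only uses numbers that appear in the text (the cloned PCR fragment, the Northern probe, and the expression insert); it assumes inclusive numbering at both ends, and the helper name is hypothetical.

```python
def span_length(first, last):
    """Length of an inclusive coordinate range."""
    return last - first + 1


pcr_fragment_bp = span_length(1984, 2460)      # cloned His-primer/Ser-primer product
northern_probe_bp = span_length(1173, 2510)    # EcoRI/BsmBI probe fragment
expression_insert_bp = span_length(1822, 2601) # insert ligated into pQE30
expressed_residues = span_length(596, 855)     # aa 596-855

print("PCR fragment:", pcr_fragment_bp, "bp")
print("Northern probe:", northern_probe_bp, "bp")
print("Expression insert:", expression_insert_bp, "bp for",
      expressed_residues, "residues")
```

On this reading the cloned fragment falls inside the 400- to 550-bp window excised from the gel, the probe is about 1.3 kilobases as stated, and the expression insert contains exactly three nucleotides per expressed residue.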

Expression and Purification of the Protease Domain. The above-mentioned plasmids were separately transformed into Escherichia coli X-90 to afford high-level expression of recombinant protease gene products (18). Expression and purification of the recombinant enzyme from solubilized inclusion bodies was performed as described (19). Protein-containing fractions were pooled and dialyzed overnight at 4°C against 50 mM Tris (pH 8), 10% glycerol, 1 mM 2-mercaptoethanol, and 3 M urea. Autoactivation of the protease was monitored on dialysis against storage buffer (50 mM Tris, pH 8/10% glycerol) at 4°C by using the substrate Spectrozyme tPA (hexahydrotyrosyl-Gly-Arg-pNA, American Diagnostics, Greenwich, CT). Hydrolysis of Spectrozyme tPA was monitored at 405 nm for the formation of p-nitroaniline by using a Uvikon 860 spectrophotometer. Activated protease was bound to an immobilized p-aminobenzamidine resin (Pierce) that had been equilibrated with storage buffer. Bound protease was eluted with 100 mM benzamidine, and the protein-containing fractions were pooled. Excess benzamidine was removed by using FPLC with a Superdex 70 (Amersham Pharmacia) gel filtration column that was equilibrated with storage buffer. Protein-containing fractions were pooled and stored at –80°C. The cleavage of the purified Ser-805 → Ala protease domain was performed at 37°C by addition of active recombinant protease domain to 10 nM. Cleavage was monitored by using SDS/PAGE.

Determination of Substrate Kinetics. The purified serine protease domain was titrated with 4-methylumbelliferyl p-guanidinobenzoate (MUGB) to obtain an accurate concentration of enzyme active sites (20). Enzyme activity was monitored at 25°C in assay buffer containing 50 mM Tris (pH 8.8), 50 mM NaCl, and 0.01% Tween 20. The final concentration of the substrate Spectrozyme tPA ranged from 1 to 400 µM. Enzyme concentrations ranged from 40 to 800 pM. Active-site titrations were performed on a Fluoromax-2 spectrofluorimeter. Measurements were plotted by using the KALEIDAGRAPH program (Synergy Software, Reading, PA), and the Km, kcat, and kcat/Km for Spectrozyme tPA were determined by using the Michaelis-Menten equation.
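
As an illustration of how Km and kcat follow from rate measurements, the sketch below generates noise-free rates from the Michaelis-Menten equation and recovers the constants with a Hanes-Woolf linearization. This is a swapped-in, dependency-free technique chosen for brevity; it is not the curve-fitting procedure run in KALEIDAGRAPH. The input constants are the point values reported in Results for Spectrozyme tPA (Km = 31.4 µM, and kcat read as 2.6 × 10² s⁻¹), the enzyme concentration is the lower bound of the stated range, and all function and variable names are hypothetical.

```python
# Michaelis-Menten rate law and a Hanes-Woolf estimate of Vmax and Km.
# Synthetic, noise-free data; names and the fitting method are illustrative.

def mm_rate(s: float, vmax: float, km: float) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def hanes_woolf_fit(s_values, v_values):
    """Least-squares line through ([S], [S]/v).

    The slope equals 1/Vmax and the intercept equals Km/Vmax.
    Returns (vmax, km).
    """
    xs = [float(s) for s in s_values]
    ys = [s / v for s, v in zip(xs, v_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    return vmax, intercept * vmax


KM_UM = 31.4         # reported Km, in µM
KCAT_PER_S = 260.0   # reported kcat, in s^-1
ENZYME_UM = 40e-6    # 40 pM expressed in µM (lower end of the stated range)

substrate_um = [1, 5, 10, 25, 50, 100, 200, 400]  # within the 1-400 µM range
vmax_true = KCAT_PER_S * ENZYME_UM
rates = [mm_rate(s, vmax_true, KM_UM) for s in substrate_um]

vmax_fit, km_fit = hanes_woolf_fit(substrate_um, rates)
kcat_fit = vmax_fit / ENZYME_UM  # kcat = Vmax / [E]

print(f"Km = {km_fit:.1f} µM, kcat = {kcat_fit:.0f} s^-1")
```

With real, noisy data a direct nonlinear fit is generally preferred, because linear transforms such as this one distort the error structure of the measurements.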

FIG. 1. Nucleotide sequence of the cDNA encoding human MT-SP1 and predicted protein sequence. Numbering indicates nucleotide or amino acid residue. Amino acids are shown in single-letter code. The termination codon is shown by *. The underlined stop codon at nucleotide 10 is in frame with the initiating methionine. The Kozak consensus sequence (24) at the start codon is underlined at nucleotide 32. The predicted N-glycosylation sites at amino acids 109, 302, 485, and 772 are underlined. A possible polyadenylation sequence (46) at nucleotide 3,120 is also underlined. The catalytic triad in the serine protease domain is highlighted: His-656, Asp-711, and Ser-805.

Inhibition of MT-SP1 Protease Domain with Ecotin and Ecotin M84R/M85R. Ecotin and ecotin M84R/M85R were purified from E. coli as described (6). Various concentrations of ecotin or ecotin M84R/M85R were incubated with the His-tagged serine protease domain in a total volume of 990 µl of buffer containing 50 mM NaCl, 50 mM Tris·HCl (pH 8.8), and 0.01% Tween 20. Ten microliters of Spectrozyme tPA was added, yielding a solution containing 100 µM substrate. The final enzyme concentration was 63 pM, and the ecotin and ecotin M84R/M85R concentrations ranged from 0.1 to 50 nM. The data were fit to the equation derived for kinetics of reversible tight-binding inhibitors (21, 22), and the values for apparent Ki were determined.
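
The tight-binding equation itself is cited (21, 22) but not written out here. A widely used form is the Morrison quadratic, which gives the fractional activity remaining when inhibitor depletion by the enzyme cannot be neglected. The sketch below assumes that this is the form intended; the function name is hypothetical, and the inputs are the enzyme concentration stated above together with the apparent Ki values reported in Results.

```python
import math

# Morrison quadratic for a reversible tight-binding inhibitor (assumed form).
# All concentrations must share one unit; picomolar is used throughout.

def fractional_activity(e_total: float, i_total: float, ki_app: float) -> float:
    """Return v_i / v_0 for total enzyme, total inhibitor, and apparent Ki."""
    s = e_total + i_total + ki_app
    # Concentration of enzyme-inhibitor complex from the quadratic solution.
    complex_conc = (s - math.sqrt(s * s - 4.0 * e_total * i_total)) / 2.0
    return 1.0 - complex_conc / e_total


ENZYME_PM = 63.0  # final enzyme concentration stated in the text
KI_APP_PM = {"ecotin": 782.0, "ecotin M84R/M85R": 9.8}  # reported apparent Ki

for name, ki in KI_APP_PM.items():
    for inhibitor_pm in (100.0, 1_000.0, 50_000.0):  # spans 0.1 to 50 nM
        remaining = fractional_activity(ENZYME_PM, inhibitor_pm, ki)
        print(f"{name:18s} [I] = {inhibitor_pm:8.0f} pM  v/v0 = {remaining:.4f}")
```

Fitting measured v/v0 values against inhibitor concentration with this function, holding the enzyme concentration fixed, is one way to obtain an apparent Ki.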

RESULTS

Cloning of Serine Protease Domain cDNAs from PC-3 Cells and Amplification of MT-SP1 cDNA. PCR amplification of serine protease cDNA was performed by using “consensus cloning,” in which the amplification was performed with degenerate primers designed to anneal to cDNA encoding the region about the conserved catalytic histidine (5′ His-primer) and the conserved catalytic serine (3′ Ser-primer). The consensus primers were designed by using 37 human sequences within a sequence alignment of 242 serine proteases of the chymotrypsin fold that are reported in the SwissProt database. To bias the screen for previously unidentified proteases in the PC-3 cDNA, uPA cDNA was cut and removed by using the known BamHI endonuclease site in the uPA cDNA sequence. The expected size of the cDNA fragments amplified between His-57 and Ser-195 cDNA (standard chymotrypsinogen numbering) is between 400 and 550 bp; statistically, only 1 in 10 cDNAs of that length will be cleaved by BamHI. Thus, cDNAs obtained from the PCR reactions with the 5′ His-primer and 3′ Ser-primer were size selected for the 400- to 550-bp range, digested with BamHI, and purified from any digested cDNAs. After a subsequent round of PCR, the products were cloned into pPCR2.1 (Fig. 2). Twenty clones were digested with EcoRI to monitor the size of the cDNA insert. Three clones lacked inserts of the correct size. The remaining 17 clones containing inserts between 400 and 550 bp were sequenced. BLAST searches of the resulting sequences revealed that six clones did not match serine protease sequences. The remaining cDNAs yielded clones corresponding to factor XII (two clones), protein C (two clones), trypsinogen type IV (two clones), uPA (one clone), and MT-SP1 (four clones). Additional serine protease sequences may not have been found because they were digested by BamHI, lost in the size selection, or present in lower frequencies.
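
The "1 in 10" estimate can be reproduced with a back-of-the-envelope calculation. The sketch below assumes equal base frequencies and treats each possible 6-bp window as independent, which is an approximation; the function name is hypothetical. Because the BamHI site GGATCC is palindromic, scanning one strand is sufficient.

```python
# Probability that a random DNA fragment contains at least one copy of a
# 6-bp recognition site such as GGATCC (BamHI). Assumes equal base
# frequencies and independent windows; both are simplifying assumptions.

def prob_site_present(length_bp: int, site_len: int = 6) -> float:
    """P(at least one site) in a random sequence of the given length."""
    windows = length_bp - site_len + 1   # number of possible start positions
    p_window = 0.25 ** site_len          # chance that one window matches
    return 1.0 - (1.0 - p_window) ** windows


p_short = prob_site_present(400)
p_long = prob_site_present(550)

for length, p in ((400, p_short), (550, p_long)):
    print(f"{length} bp: P(cut) = {p:.3f}")
```

Under these assumptions the probability lies near one tenth across the 400- to 550-bp window, so the digestion step should remove only a small fraction of unrelated cDNAs.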

FIG. 2. Lane 1 shows the PCR products obtained by using degenerate primers designed from the consensus sequences flanking the catalytic histidine (5′ His-primer) and the catalytic serine (3′ Ser-primer). The products remaining between 400 and 550 bp after digestion with BamHI were reamplified by using the same degenerate primers. The products from this second PCR are shown in Lane 2.

Multiple expressed sequence tag sequences were found for the cDNA. Expressed sequence tag accessions aa459076, aa219372, and w39209 were used extensively for sequencing the cDNA starting from nucleotide 746 and 2,461–3,142, but no start codon was observed. A sequence was also found in GenBank (accession no. U20428). This sequence also lacks the 5′ end of the cDNA but allowed amplification of cDNA from nucleotides 196–745. Rapid amplification of cDNA ends (RACE) (23) was used to obtain further 5′ cDNA sequence. Application of RACE did not yield a clone containing the entire 5′-untranslated region, but the sequence obtained contained a stop codon in-frame with the Kozak start sequence (24), giving confidence that the full coding sequence of the cDNA has been obtained. The nucleotide sequence and predicted amino acid sequence are shown in Fig. 1.
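
The logic of using an upstream in-frame stop codon to support a start-site assignment can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, and the toy sequence is invented for demonstration and is not the MT-SP1 cDNA. It checks whether the bases around a candidate ATG match the ACCATGG motif and walks upstream in steps of three to find a stop codon in the same reading frame.

```python
# Check a candidate start codon for (i) a match to the ACCATGG motif and
# (ii) an in-frame stop codon upstream. The toy sequence below is invented
# for illustration and is NOT the MT-SP1 cDNA sequence.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def matches_kozak(seq: str, atg_index: int, motif: str = "ACCATGG") -> bool:
    """True if the motif is present with its ATG at atg_index."""
    start = atg_index - motif.index("ATG")
    return start >= 0 and seq[start:start + len(motif)] == motif


def upstream_in_frame_stop(seq: str, atg_index: int):
    """Index of the nearest upstream stop codon in frame with the ATG, or None."""
    i = atg_index - 3
    while i >= 0:
        if seq[i:i + 3] in STOP_CODONS:
            return i
        i -= 3
    return None


toy_seq = "GGTGAGCCGCCACCATGGGCTCCGAT"  # hypothetical 5' region
atg = toy_seq.find("ATG")

kozak_ok = matches_kozak(toy_seq, atg)
stop_at = upstream_in_frame_stop(toy_seq, atg)

print(atg, kozak_ok, stop_at)
```

An in-frame stop upstream of the candidate ATG means that no longer open reading frame can extend further 5′ in that frame, which is the basis of the confidence expressed above.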

The nucleotide sequence surrounding the proposed start codon matches the optimal sequence of ACCATGG for translation initiation sites proposed by Kozak (24). In addition, there is a stop codon in-frame with the putative start codon, which gives further evidence that initiation occurs at that site. The DNA sequence predicts an 855-aa mosaic protein composed of multiple domains (Fig. 3). The coding sequence does not contain a typical signal peptide but does contain a single hydrophobic sequence of 26 residues (residues 55–81), which is flanked by a charged residue on each side. This sequence may constitute a signal anchor sequence, similar to that observed in other proteases, including hepsin (25) and enteropeptidase (26). Following the putative signal anchor sequence are two complement factor 1R-urchin embryonic growth factor-bone morphogenetic protein (CUB) domains (27), which are named after the proteins in which the modules were first discovered: complement subcomponents C1s and C1r, urchin embryonic growth factor (Uegf), and bone morphogenetic protein 1 (BMP1). CUB domains have conserved characteristics, which include the presence of four cysteine residues and various conserved hydrophobic and aromatic positions (27). The CUB domain, which has recently been characterized crystallographically (28), consists of 10 β-strands that are organized into two 5-stranded β-sheets. Following the CUB domains are four low-density lipoprotein receptor (LDLR) repeats (29), which are named after the receptor ligand-binding repeats that are present in the LDLR. These repeats have a highly conserved pattern and spacing of six cysteine residues that form three intramolecular disulfide bonds. The final domain observed is the serine protease domain. The alignments of these domains with other members of their respective classes are shown in Fig. 4.

FIG. 3. The domain structure of human MT-SP1 is compared with the domain structure of enteropeptidase (47) and hepsin (25). SA, possible signal anchor; CUB, a repeat first identified in complement components C1r and C1s, the urchin embryonic growth factor and bone morphogenetic protein 1 (27); L, LDLR repeat (29); SP, a chymotrypsin family serine protease domain (40); MAM, a domain homologous to members of a family defined by meprin, protein A5, and the protein tyrosine phosphatase µ (48); MSCR, a macrophage scavenger receptor cysteine-rich motif (29). The predicted disulfide linkages are shown labeled as C–C.

Tissue Distribution of MT-SP1 mRNA. Northern blots of human poly(A)+ RNA, made by using a 1.3-kilobase fragment of MT-SP1 cDNA as a probe, show a ≈3.3-kilobase fragment appearing in epithelial tissues including the prostate, kidney, lung, small intestine, stomach, colon, and placenta, as well as other tissues, including spleen, liver, leukocytes, and thymus. This band was not observed in muscle, brain, ovary, or testis (Fig. 5). Similar experiments performed on a human cancer cell line blot show that MT-SP1 is expressed in the colorectal adenocarcinoma SW480 but was not observed in the promyelocytic leukemia HL-60, HeLa cell S3, chronic myelogenous leukemia K-562, lymphoblastic leukemia MOLT-4, Burkitt’s lymphoma Raji, lung carcinoma A549, or melanoma G361 lanes (data not shown). This 3.3-kilobase mRNA fragment is slightly longer than the 3.1-kilobase sequence presented in Fig. 1, suggesting that there may still be sequence in the 5′-untranslated region that has not been identified.

Activation and Purification of His-MT-SP1 Protease Domain. The serine protease domain of MT-SP1 was expressed in E. coli as a His-tagged fusion and was purified from inclusion bodies under denaturing conditions by using metal-chelate affinity chromatography. The yield of enzyme after this step was ≈3 mg of protein per liter of E. coli culture. This denatured protein refolded when the urea was dialyzed from the protein. Surprisingly, the purified renatured protein showed a time-dependent shift on an SDS/PAGE gel (Fig. 6A), with the lower fragment being the size of the mature, processed enzyme lacking the His tag. N-terminal sequencing of the purified, activated protease domain yielded the expected VVGGT activation sequence. When the refolded protein was tested for activity by using the synthetic substrate Spectrozyme tPA, a time-dependent increase in activity was observed (Fig. 6B). In contrast, the protease domain that contains the Ser805Ala mutation showed neither a change in size on an SDS polyacrylamide gel nor an increase in enzymatic activity under identical conditions (data not shown), suggesting that the catalytic serine is necessary for activation and that activation is not the result of a contaminating protease. To show that the cleavage of the protease domain was a result of His-tagged MT-SP1 protease activity, the inactive Ser805Ala protease domain was treated with purified recombinant enzyme (Fig. 6C). This treatment results in the formation of a cleavage product that corresponds to the size of the active protease (Fig. 6C, lane 7). Untreated protease domain is not cleaved (Fig. 6C, lane 8). From these results, it is concluded that the protease autoactivates on refolding. The activated protease was separated from inactive protein and other contaminants by using affinity chromatography with p-aminobenzamidine resin. Purified protein was analyzed by using SDS/PAGE, and no other contaminants were observed. Similarly, immunoblotting with polyclonal antiserum against purified protease domain (raised in rabbits at Berkeley Antibody, Richmond, CA) revealed one band. Under nonreducing conditions, the pro region is disulfide-linked to the protease domain; thus, this purified protein was also immunoreactive with the mAb (Qiagen, Chatsworth, CA) directed against the N-terminal Arg-Gly-Ser-His4 epitope that is contained in the recombinant protease domain, further indicating the purity and identity of the protein (data not shown).

FIG. 4. Multiple sequence alignments of MT-SP1 structural motifs. L, loops; β, β-sheets; α, α-helices; S-S, disulfides. (A) Multiple sequence alignment of the serine protease domain of MT-SP1 with human trypsinogen B (49), human enterokinase (47), human hepsin (25), human tryptase 2 (50), and human chymotrypsinogen B (51), with standard chymotrypsin numbering. Conserved catalytic and structural residues described in the text are underlined. (B) Alignment of MT-SP1 LDLR with domains of the LDLR (52). (C) Alignment of the CUB domains of MT-SP1 with those found in human enterokinase (48), human bone morphogenetic protein 1 (53), and complement component C1R (54).

Kinetic Properties of Purified His-MT-SP1 Protease Domain. The enzyme concentration was determined by using an active-site titration with MUGB. The catalytic activity of the protease domain was monitored by using pNA substrates.

Purified protease domain was tested for hydrolytic activity against tetrapeptide substrates of the form Suc-AAPX-pNA, which contained various amino acids at the P1 position (P1-Ala, Asp, Glu, Phe, Leu, Met, Lys, or Arg). The only substrates with detectable activity were those with P1-Lys or P1-Arg. The serine protease domain with the Ser805Ala mutation had no detectable activity. The activity of the protease domain was further characterized by using the substrate Spectrozyme tPA, yielding Km = 31.4 ± 4.2 µM, kcat = 2.6 × 10² ± 6.5 s⁻¹, and kcat/Km = 6.9 × 10⁶ ± 2.3 × 10⁶ M⁻¹·s⁻¹. Ecotin inhibition of the MT-SP1 His-tagged protease domain fits a tight-binding reversible inhibitory model (21, 22) as observed for ecotin interaction with other serine protease targets (6, 7, 30). Inhibition assays by using ecotin and ecotin M84R/M85R yielded apparent Ki values of 782 ± 92 pM and 9.8 ± 1.5 pM, respectively.

FIG. 5. Tissue distribution of MT-SP1 mRNA levels. Northern blots of human poly(A)+ RNA from assorted human tissues were hybridized with radiolabeled cDNA probes as described in Materials and Methods. Upper shows hybridization by using a MT-SP1 1.3-kilobase cDNA fragment derived from expressed sequence tag clone w39209 and exposed overnight. Lower shows the same blot after being stripped and rehybridized with a loading standard β-actin (A) or human glyceraldehyde phosphate dehydrogenase (GAPDH) (B) cDNA probe exposed for 2 hours. The mobility of RNA size standards is indicated at the left.

DISCUSSION

Structural Motifs of MT-SP1. In this work, we characterized the expression of chymotrypsin-fold proteases by PC-3 cells and cloned a member of this family that we call MT-SP1. The name membrane-type serine protease 1 (MT-SP1) is given to be consistent with the nomenclature of the membrane-type matrix metalloproteases (MT-MMPs; ref. 31). The cDNA likely encodes a membrane-type protein because of the lack of a signal sequence and the presence of a putative SA that is also seen in the other membrane-type serine proteases hepsin (25), enteropeptidase (26), TMPRSS2 (32), and human airway trypsin-like protease (33). We propose that proteins that are localized to the membrane through a SA and that encode a chymotrypsin-fold serine protease domain be categorized in the MT-SP family. The membrane localization of MT-SP1 is supported by immunofluorescence experiments that localize the protease domain to the extracellular cell surface (unpublished results).

Following the putative SA are several domains that are thought to be involved in protein-protein interactions or protein-ligand interactions. For example, CUB domains can mediate protein-protein interactions as with the seminal plasma PSP-I/PSP-II heterodimer that is built by CUB-domain interactions (28) and with procollagen C-proteinase enhancer protein and procollagen C-proteinase (BMP-1) (34, 35). Interestingly, most of the proteins that contain CUB domains are involved in developmental processes or are involved in proteolytic cascades (27), which suggests that MT-SP1 may play a similar role. The four repeated motifs that follow the CUB domains are known as LDLR ligand-binding repeats, named after the seven copies of repeats found in the LDLR. There are several negatively charged amino acids between the fourth and sixth cysteines that are highly conserved in the LDLR and are also seen in the LDLR repeats of MT-SP1. The conserved motif Ser-Asp-Glu (residues 44–46 in Fig. 4) is known to be important for binding the positively charged residues of the LDLR ligands apolipoprotein B-100 (ApoB-100) and ApoE (29). The ligand-binding repeats of MT-SP1 most likely do not mediate interaction with ApoB-100 or ApoE but may be involved in the interaction with other positively charged ligands. For example, LDLR repeats in the LDLR-related protein have been implicated in the binding and recycling of protease-inhibitor complexes such as uPA-plasminogen activator inhibitor-1 (PAI-1) complexes (reviewed in refs. 36 and 37). It also has been shown that the pro domain of enteropeptidase is involved in interactions with its substrate trypsinogen, allowing 520-fold greater catalytic efficiency in the cleavage compared with the protease domain alone (38). By analogy, similar interactions should occur between MT-SP1 and its substrates. Thus, further investigation of MT-SP1 CUB domain or LDLR repeat interactions may yield insight into the function of this protein.

FIG. 6. Activation and purification of His-tagged MT-SP1 protease domain. A representative experiment is shown in A and B. (A) Activation at 4°C was monitored by using SDS/PAGE. The upper band represents inactive protease domain, and the lower band represents active protease (also verified by N-terminal sequencing). (B) The activation of the protein was monitored by using Spectrozyme tPA as a synthetic substrate for the protease domain. (C) Inactive Ser805Ala protease domain is cleaved with 10 nM activated His-tagged MT-SP1 protease domain at 37°C. The specific cleavage of active MT-SP1 protease domain is required for proper processing at the activation site. Active protease domain is shown in lane 7 (+), and no cleavage of the untreated inactive protease domain is observed (lane 8, –).

The amino acid sequence of the serine protease domain of MT-SP1 is highly homologous to other proteases found in the family (Fig. 4). The essential features of a functional serine protease are contained in the deduced amino acid sequence of the domain. The residues that comprise the catalytic triad, His-656, Asp-711, and Ser-805, corresponding to His-57, Asp-102, and Ser-195 in chymotrypsin, are observed in MT-SP1 (for reviews, see refs. 39 and 40). The sequence Ser-214–Trp-215–Gly-216 (Ser-825–Trp-826–Gly-827), which is thought to interact with the side chains of the substrate for properly orienting the scissile bond, is present. Gly-193 (Gly-803) and Gly-196 (Gly-806), which are thought to be necessary for proper orientation of Ser-195 (Ser-805), also are present. Based on homology to chymotrypsin, three disulfide bonds are predicted to form within the protease domain at Cys-44–Cys-58, Cys-168–Cys-182, and Cys-191–Cys-220 (Cys-643–Cys-657, Cys-776–Cys-790, and Cys-801–Cys-830), and a fourth disulfide bond should form between the catalytic domain and the pro-domain at Cys-122–Cys-1 (Cys-731–Cys-604), as observed for chymotrypsin. This predicted disulfide with the pro domain suggests that the active catalytic domain should still be localized to the cell surface via a disulfide linkage. The presence of the catalytic machinery and other conserved structural components described above suggests that all features necessary for proteolytic activity are present in the encoded sequence.
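
The correspondence between chymotrypsin numbering and MT-SP1 numbering comes from the sequence alignment (Fig. 4) and cannot be reduced to a single fixed offset. The sketch below tabulates residue pairs quoted in the text (catalytic triad, substrate-orienting residues, and cysteines) and shows that the difference between the two numbering schemes varies along the domain. The dictionary and function names are hypothetical.

```python
# Residue pairs quoted in the text: chymotrypsin numbering -> MT-SP1 numbering.
# The table is illustrative and covers only positions stated in this paragraph.

CHYMO_TO_MTSP1 = {
    57: 656, 102: 711, 195: 805,            # catalytic triad His, Asp, Ser
    214: 825, 215: 826, 216: 827,           # Ser-Trp-Gly
    193: 803,                               # Gly near the catalytic serine
    44: 643, 58: 657, 168: 776, 182: 790,   # intradomain disulfide partners
    191: 801, 220: 830,
    122: 731, 1: 604,                       # disulfide to the pro-domain
}


def offset(chymo_number: int) -> int:
    """Difference between the MT-SP1 and chymotrypsin numbers at one position."""
    return CHYMO_TO_MTSP1[chymo_number] - chymo_number


offsets = {pos: offset(pos) for pos in sorted(CHYMO_TO_MTSP1)}
distinct_offsets = sorted(set(offsets.values()))

for pos, diff in offsets.items():
    print(f"chymotrypsin {pos:3d} -> MT-SP1 {CHYMO_TO_MTSP1[pos]:3d} (offset {diff})")
print("distinct offsets:", distinct_offsets)
```

The varying offsets reflect insertions and deletions relative to chymotrypsin, which is why positions should be converted through the alignment and not by adding a constant.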

Substrate Specificity of the MT-SP1 Protease Domain. The S1 site specificity (41) of a protease is largely determined by the amino acid residue at position 189. This position is occupied by an aspartate in MT-SP1, suggesting that the protease has specificity for Arg/Lys in the P1 position. In addition, the presence of a polar Gln-192 (Gln-802), as in trypsin, is consistent with basic specificity. Furthermore, the presence of Gly-216 (Gly-827) and Gly-226 (Gly-837) is consistent with the presence of a deep S1 pocket, unlike elastase, which has Val-216 and Thr-226 that block the pocket and thereby contribute to the P1 specificity for small hydrophobic side chains. The specificity at the other subsites is largely dependent on the nature of the seven loops A–E and loops 2 and 3 (Fig. 4). Loop C in enterokinase has a number of positively charged residues that are thought to interact with the negatively charged activation site in trypsinogen, Asp-Asp-Asp-Asp-Lys (26). One known substrate for MT-SP1 (as described below) is the activation site of MT-SP1, which is Arg-Gln-Ala-Arg (residues 611–614). Loop C contains two Asp residues that may participate in the recognition of the activation sequence.

One means of obtaining further data on substrate specificity is by characterization of the activity of the recombinant proteolytic domain. Enterokinase has been characterized from both recombinant (38, 42) and native (43, 44) sources. However, proteolytic activity for the other reported membrane-type serine proteases hepsin (25) and TMPRSS2 (32) is only predicted based on sequence homology. To produce active recombinant MT-SP1, a His-tagged fusion of the protease domain was cloned into an E. coli vector and expressed and purified to homogeneity. Fortuitously, the protease domain refolded and autoactivated after resuspension and purification from inclusion bodies. This activity, coupled with the lack of activity in the Ser195Ala (Ser805Ala) variant, demonstrates that the cDNA encodes a catalytically proficient protease. Autoactivation of the protease domain at the arginine-valine site (Arg-614–Val-615) shows that the protease has Arg/Lys specificity as predicted by the sequence homology to other proteases of basic specificity. Specificity and selectivity are confirmed by the lack of cleavage of AAPX-pNA substrates that do not have X = R or K. Further characterization with Spectrozyme tPA revealed an active enzyme with kcat = 2.6 × 10² s⁻¹. However, the His-tagged serine protease domain does not cleave H-Arg-pNA, showing that, unlike trypsin, there is a requirement for additional subsite occupation for catalytic activity. This suggests that the enzyme is involved in a regulatory role that requires selective processing of particular substrates rather than nonselective degradation.
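
The P1 rule alone can be expressed as a simple scan for Arg or Lys. The sketch below applies it to the nine residues around the activation site as described in the text (Arg-Gln-Ala-Arg at residues 611–614, followed by the VVGGT N terminus of the activated domain). The function names are hypothetical, and the rule is a deliberate oversimplification: as noted above, the enzyme also requires occupation of additional subsites, so not every site found this way is expected to be cleaved.

```python
# Candidate cleavage sites predicted from P1 = Arg or Lys only.
# This ignores the extended subsite requirements described in the text.

def p1_basic_sites(peptide: str):
    """Indices i such that cleavage would follow residue i (P1 = R or K)."""
    return [i for i, aa in enumerate(peptide[:-1]) if aa in "RK"]


def split_at(peptide: str, site: int):
    """Return the two fragments produced by cleavage after index `site`."""
    return peptide[: site + 1], peptide[site + 1:]


ACTIVATION_REGION = "RQARVVGGT"  # residues 611-619 as described in the text

sites = p1_basic_sites(ACTIVATION_REGION)
fragments = [split_at(ACTIVATION_REGION, s) for s in sites]

for site, (left, right) in zip(sites, fragments):
    print(f"cleavage after index {site}: {left} | {right}")
```

Only the cleavage after the second arginine yields a fragment beginning with VVGGT, matching the N-terminal sequence observed for the activated protease.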

MT-SP1 Function. In other studies, we have found that inhibition of serine protease activity by ecotin or ecotin M84R/M85R inhibits testosterone-induced branching ductal morphogenesis and enhances apoptosis in a rat ventral prostate model (F. Elfman, T.T., C.S.C., G. Cunha, and M.A.S., unpublished results). Moreover, the rat homolog of MT-SP1 is expressed in the normal rat ventral prostate (data not shown). Assays of the protease domain with ecotin and ecotin M84R/M85R showed that the enzymatic activity is strongly inhibited (apparent Ki of 782 ± 92 pM and 9.8 ± 1.5 pM, respectively), suggesting that rat MT-SP1 is likely to be inhibited at the concentrations of these inhibitors used in our experiments. MT-SP1 inhibition may result in the observed inhibition of differentiation and/or increased apoptosis. Future studies are aimed at definitively resolving the role of MT-SP1 in prostate differentiation. The broad expression of MT-SP1 in epithelial tissues is consistent with the possibility that it is involved in cell maintenance or growth, perhaps by activating growth factors or by processing prohormones.

MT-SP1 may participate in a proteolytic cascade that results in cell growth and/or differentiation. Another structurally similar membrane-type serine protease, enteropeptidase (Fig. 3), is involved in a proteolytic cascade by which activation of trypsinogen leads to activation of downstream intestinal proteases (5). Enteropeptidase is expressed only in the enterocytes of the proximal small intestine, thus precisely restricting activation of trypsinogen. Thus, in contrast to secreted proteases that may diffuse throughout the organism, the membrane association of MT-SP1 should also allow the proteolytic activity to be precisely localized, which may be important for proper physiological function; improper localization of the enzyme or improper levels of downstream substrates could lead to disease.

We have found that subcutaneous coinjection of PC-3 cells with wild-type ecotin or ecotin M84R/M85R led to a decrease in the primary tumor size compared with animals in which PC-3 cells and saline were injected (O. Melnyk, T.T., C.S.C., and M.A.S., unpublished results). Because wild-type ecotin is a poor, micromolar inhibitor of uPA, serine proteases other than uPA likely are involved in this primary tumor proliferation. Both wild-type ecotin and ecotin M84R/M85R are potent, subnanomolar inhibitors of MT-SP1, raising the possibility that MT-SP1 plays an important role in progression of epithelial cancers expressing this protease.

Direct biochemical isolation of the substrates may be possible if MT-SP1 adhesive domains such as the CUB domains or LDLR repeats interact with the substrates. In addition, likely substrates may be predicted and tested for by using knowledge of extended enzyme specificity. For example, the characterization of the substrate specificity of granzyme B allowed the prediction and confirmation of substrates for this serine protease (45). Thus, these complementary studies should shed further light on the physiological function of this enzyme.

We thank Marion Conn, Robert Maeda, Todd Pray, Ibrahim Adiguzel, and Ralph Reid for technical assistance and helpful discussions. T.T. was supported by a National Institutes of Health postdoctoral fellowship CA71097, and this work was supported by National Institutes of Health Grant CA72006.

1. Neurath, H. & Walsh, K.A. (1976) Proc. Natl. Acad. Sci. USA 73, 3825–3832.
2. Davie, E.W., Fujikawa, K. & Kisiel, W. (1991) Biochemistry 30, 10363–10370.
3. Chandler, W.L. (1996) Crit. Rev. Oncol. Hematol. 24, 27–45.
4. Reid, K.B.M. & Porter, R.R. (1981) Annu. Rev. Biochem. 50, 433–464.
5. Huber, R. & Bode, W. (1978) Acc. Chem. Res. 11, 114–122.
6. Wang, C.-I., Yang, Q. & Craik, C.S. (1995) J. Biol. Chem. 270, 12250–12256.
7. Yang, S.Q., Wang, C.-I., Gillmor, S.A., Fletterick, R.J. & Craik, C.S. (1998) J. Mol. Biol. 279, 945–957.

8. Dano, K., Andreasen, P.A., Grondahl-Hansen, J., Kristensen, P., Nielsen, L.S. & Skriver, L. (1985) Adv. Cancer Res. 44, 139–266.
9. Andreasen, P.A., Kjoller, L., Christensen, L. & Duffy, M.J. (1997) Int. J. Cancer 72, 1–22.
10. Kaighn, M.E., Narayan, K.S., Ohnuki, Y., Lechner, J.F. & Jones, L.W. (1979) Invest. Urol. 17, 16–23.
11. Yoshida, E., Verrusio, E.N., Mihara, H., Oh, D. & Kwaan, H.C. (1994) Cancer Res. 54, 3300–3304.
12. Sakanari, J.A., Staunton, C.E., Eakin, A.E., Craik, C.S. & McKerrow, J.H. (1989) Proc. Natl. Acad. Sci. USA 86, 4863–4867.
13. Wiegand, U., Corbach, S., Minn, A., Kang, J. & Muller-Hill, B. (1993) Gene 136, 167–175.
14. Kang, J., Wiegand, U. & Muller-Hill, B. (1992) Gene 110, 181–187.
15. Borson, N.D., Salo, W.L. & Drewes, L.R. (1992) PCR Methods Appl. 2, 144–148.
16. Don, R.H., Cox, P.T., Wainwright, B.J., Baker, K. & Mattick, J.S. (1991) Nucleic Acids Res. 19, 4008.
17. Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. & Struhl, K., eds. (1990) Current Protocols in Molecular Biology (Wiley, New York).
18. Evnin, L.B., Vasquez, J.R. & Craik, C.S. (1990) Proc. Natl. Acad. Sci. USA 87, 6659–6663.
19. Unal, A., Pray, T.R., Lagunoff, M., Pennington, M.W., Ganem, D. & Craik, C.S. (1997) J. Virol. 71, 7030–7038.
20. Jameson, G.W., Roberts, D.V., Adams, R.W., Kyle, W.S.A. & Elmore, D.T. (1973) Biochem. J. 131, 107–117.
21. Morrison, J.F. (1969) Biochim. Biophys. Acta 185, 269–286.
22. Williams, J.W. & Morrison, J.F. (1979) Methods Enzymol. 63, 437–467.
23. Frohman, M.A. (1993) Methods Enzymol. 218, 340–356.
24. Kozak, M. (1991) J. Cell Biol. 115, 887–903.
25. Leytus, S.P., Loeb, K.R., Hagen, F.S., Kurachi, K. & Davie, E.W. (1988) Biochemistry 27, 1067–1074.
26. Kitamoto, Y., Yuan, X., Wu, Q., McCourt, D.W. & Sadler, J.E. (1994) Proc. Natl. Acad. Sci. USA 91, 7588–7592.
27. Bork, P. & Beckmann, G. (1993) J. Mol. Biol. 231, 539–545.
28. Varela, P.F., Romero, A., Sanz, L., Romao, M.J., Topfer-Petersen, E. & Calvete, J.J. (1997) J. Mol. Biol. 274, 635–649.
29. Krieger, M. & Herz, J. (1994) Annu. Rev. Biochem. 63, 601–637.
30. Seymour, J.L., Lindquist, R.N., Dennis, M.S., Moffat, B., Yansura, D., Reilly, D., Wessinger, M.E. & Lazarus, R.A. (1994) Biochemistry 33, 3949–3958.
31. Nagase, H. (1997) Biol. Chem. 378, 151–160.
32. Paoloni-Giacobino, A., Chen, H., Peitsch, M.C., Rossier, C. & Antonarakis, S.E. (1997) Genomics 44, 309–320.
33. Yamaoka, K., Masuda, K., Ogawa, H., Takagi, K., Umemoto, N. & Yasuoka, S. (1998) J. Biol. Chem. 273, 11895–11901.
34. Kessler, E. & Adar, R. (1989) Eur. J. Biochem. 186, 115–121.
35. Hulmes, D.J.S., Mould, A.P. & Kessler, E. (1997) Matrix Biol. 16, 41–45.
36. Strickland, D.K., Kounnas, M.Z. & Argraves, W.S. (1995) FASEB J. 9, 890–898.
37. Moestrup, S.K. (1994) Biochim. Biophys. Acta 1197, 197–213.
38. Lu, D., Yuan, X., Zheng, X. & Sadler, J.E. (1997) J. Biol. Chem. 272, 31293–31300.
39. Perona, J.J. & Craik, C.S. (1995) Protein Sci. 4, 337–360.
40. Perona, J.J. & Craik, C.S. (1997) J. Biol. Chem. 272, 29987–29990.
41. Schechter, I. & Berger, A. (1967) Biochem. Biophys. Res. Commun. 27, 157–162.
42. LaVallie, E.R., Rehmtulla, A., Racie, L.A., DiBlasio, E.A., Ferenz, C., Grant, K.L., Light, A. & McCoy, J.M. (1993) J. Biol. Chem. 268, 23311–23317.
43. Light, A. & Fonseca, P. (1984) J. Biol. Chem. 259, 13195–13198.
44. Matsushima, M., Ichinose, M., Yahagi, N., Kakei, N., Tsukada, S., Miki, K., Kurokawa, K., Tashiro, K., Shiokawa, K., Shinomiya, K., et al. (1994) J. Biol. Chem. 269, 19976–19982.
45. Harris, J.L., Peterson, E.P., Hudig, D., Thornberry, N.A. & Craik, C.S. (1998) J. Biol. Chem. 273, 27364–27373.
46. Nevins, J.R. (1983) Annu. Rev. Biochem. 52, 441–466.
47. Kitamoto, Y., Veile, R.A., Donis-Keller, H. & Sadler, J.E. (1995) Biochemistry 34, 4562–4568.
48. Beckmann, G. & Bork, P. (1993) Trends Biochem. Sci. 18, 40–41.
49. Emi, M., Nakamura, Y., Ogawa, M., Yamamoto, T., Nishide, T., Mori, T. & Matsubara, K. (1986) Gene 41, 305–310.
50. Vanderslice, P., Ballinger, S.M., Tam, E.K., Goldstein, S.M., Craik, C.S. & Caughey, G.H. (1990) Proc. Natl. Acad. Sci. USA 87, 3811–3815.
51. Tomita, N., Izumoto, Y., Horii, A., Doi, S., Yokouchi, H., Ogawa, M., Mori, T. & Matsubara, K. (1989) Biochem. Biophys. Res. Commun. 158, 569–575.
52. Sudhof, T.C., Goldstein, J.L., Brown, M.S. & Russell, D.W. (1985) Science 228, 815–822.
53. Wozney, J.M., Rosen, V., Celeste, A.J., Mitsock, L.M., Whitters, M.J., Kriz, R.W., Hewick, R.M. & Wang, E.A. (1988) Science 242, 1528–1534.
54. Leytus, S.P., Kurachi, K., Sakariassen, K.S. & Davie, E.W. (1986) Biochemistry 25, 4855–4863.

REVERSE BIOCHEMISTRY: USE OF MACROMOLECULAR PROTEASE INHIBITORS TO DISSECT COMPLEX BIOLOGICAL PROCESSES AND IDENTIFY A MEMBRANE-TYPE SERINE PROTEASE IN EPITHELIAL CANCER AND NORMAL TISSUE

