potential aggregation prone regions in biotherapeutics

14
[mAbs 1:3, 254-267; May/June 2009]; ©2009 Landes Bioscience 254 mAbs 2009; Vol. 1 Issue 3 Aggregation of a biotherapeutic is of significant concern and judicious process and formulation development is required to minimize aggregate levels in the final product. Aggregation of a protein in solution is driven by intrinsic and extrinsic factors. In this work we have focused on aggregation as an intrinsic property of the molecule. We have studied the sequences and Fab struc- tures of commercial and non-commercial antibody sequences for their vulnerability towards aggregation by using sequence based computational tools to identify potential aggregation-prone motifs or regions. The mAbs in our dataset contain 2 to 8 aggre- gation-prone motifs per heavy and light chain pair. Some of these motifs are located in variable domains, primarily in CDRs. Most aggregation-prone motifs are rich in β branched aliphatic and aromatic residues. Hydroxyl-containing Ser/Thr residues are also found in several aggregation-prone motifs while charged residues are rare. The motifs found in light chain CDR3 are glutamine (Q)/asparagine (N) rich. These motifs are similar to the reported aggregation promoting regions found in prion and amyloido- genic proteins that are also rich in Q/N, aliphatic and aromatic residues. The implication is that one possible mechanism for aggregation of mAbs may be through formation of cross-β structures and fibrils. Mapping on the available Fab—receptor/ antigen complex structures reveals that these motifs in CDRs might also contribute significantly towards receptor/antigen binding. Our analysis identifies the opportunity and tools for simultaneous optimization of the therapeutic protein sequence for potency and specificity while reducing vulnerability towards aggregation. Introduction Therapeutic monoclonal antibodies (mAb) are a class of medi- cations derived from living organism and produced by means of biologic techniques such as hybridoma, 1 recombinant DNA 2 and phage display. 3 Monoclonal antibodies are highly specific, have high affinity as well as selectivity for binding with thera- peutic targets and exhibit less degree of non-mechanism toxicity. The antibodies can also be exogenously engineered in commer- cially viable manner. The fraction of therapeutic mAbs has been increasing in the research and development portfolios of phar- maceutical industry. So far, more than 20 therapeutic mAbs with various indications have been approved by US Food and Drug Administration (FDA) and hundreds of monoclonal antibodies are currently in various stage of development. 4 Antibodies have been studied for over a century, so the basic structures and functions of antibodies are now well understood. 5-7 Briefly, the building units of antibodies consist of two identical light and two identical heavy polypeptide chains held together by disulfide bridges. Light chains have two isotypes, namely λ and κ, differing in sequence composition. Heavy chains have five isotypes based on chain structure and effector function. All currently approved therapeutic mAbs belong to the IgG class. IgG has the simplest form and is the major immunoglobulin type in human sera. IgGs are further divided into four subclasses, IgG1, IgG2, IgG3 and IgG4. The sequences of light chains and different IgG heavy chains are variable in the N-terminal variable domain and conserved in the remaining constant domain. The variable domains, especially the complementarity determining regions (CDRs), determine the antigen binding specificity. Each IgG subclass has characteristic disulfide bond pattern, differing mostly in the hinge region. The hinge is the most flexible region in a mAb. It connects the two Fab and the Fc domains into ‘Y’ or ‘T’ like overall structures as revealed from the crystal structures for intact murine antibodies. 8,9 Monoclonal antibodies, like other protein therapeutics, are susceptible to degradation at all stages from production to dose *Correspondence to: Sandeep Kumar; Pfizer Global Research and Development; Pharmaceutical R & D; AA324; 700 Chesterfield Parkway West; Chesterfield, MO 63017 USA; Tel.: 636.247.0176; Fax: 636.247.0247; Email: Sandeep.Kumar@ Pfizer.com Submitted: 12/10/08; Accepted: 01/29/09 Previously published online as a mAbs E-publication: http://www.landesbioscience.com/journals/mabs/article/8035 Report Potential aggregation prone regions in biotherapeutics A survey of commercial monoclonal antibodies Xiaoling Wang, Tapan K. Das, Satish K. Singh and Sandeep Kumar* Pharmaceutical R & D; Global Biologics; Pfizer Global Research & Development; Chesterfield, MO USA Abbreviations: mAb, monoclonal antibody; FDA, food and drug administration; CDR, complementarity determining region; Fab, frag- ment, antigen binding; Fc, fragment, crystallize easily; Fv, fragment, variable; V L , variable domain of light chain; V H , variable domain of heavy chain; scFv, single chain Fv; C L , constant domain of light chain; C H , constant domain of heavy chain; PDB, protein data bank; Q/N, glutamine and/or asparagine; QbD, quality by design Key words: monoclonal antibody, aggregation, antibody sequence, aggregation-prone region, aggregation prediction

Upload: national-institute-of-biologicals

Post on 15-Jul-2015

88 views

Category:

Health & Medicine


0 download

TRANSCRIPT

Page 1: Potential aggregation prone regions in biotherapeutics

[mAbs 1:3, 254-267; May/June 2009]; ©2009 Landes Bioscience

254 mAbs 2009; Vol. 1 Issue 3

Aggregation of a biotherapeutic is of significant concern and judicious process and formulation development is required to minimize aggregate levels in the final product. Aggregation of a protein in solution is driven by intrinsic and extrinsic factors. In this work we have focused on aggregation as an intrinsic property of the molecule. We have studied the sequences and Fab struc-tures of commercial and non-commercial antibody sequences for their vulnerability towards aggregation by using sequence based computational tools to identify potential aggregation-prone motifs or regions. The mAbs in our dataset contain 2 to 8 aggre-gation-prone motifs per heavy and light chain pair. Some of these motifs are located in variable domains, primarily in CDRs. Most aggregation-prone motifs are rich in β branched aliphatic and aromatic residues. Hydroxyl-containing Ser/Thr residues are also found in several aggregation-prone motifs while charged residues are rare. The motifs found in light chain CDR3 are glutamine (Q)/asparagine (N) rich. These motifs are similar to the reported aggregation promoting regions found in prion and amyloido-genic proteins that are also rich in Q/N, aliphatic and aromatic residues. The implication is that one possible mechanism for aggregation of mAbs may be through formation of cross-β structures and fibrils. Mapping on the available Fab—receptor/antigen complex structures reveals that these motifs in CDRs might also contribute significantly towards receptor/antigen binding. Our analysis identifies the opportunity and tools for simultaneous optimization of the therapeutic protein sequence for potency and specificity while reducing vulnerability towards aggregation.

Introduction

Therapeutic monoclonal antibodies (mAb) are a class of medi-cations derived from living organism and produced by means of biologic techniques such as hybridoma,1 recombinant DNA2 and phage display.3 Monoclonal antibodies are highly specific, have high affinity as well as selectivity for binding with thera-peutic targets and exhibit less degree of non-mechanism toxicity. The antibodies can also be exogenously engineered in commer-cially viable manner. The fraction of therapeutic mAbs has been increasing in the research and development portfolios of phar-maceutical industry. So far, more than 20 therapeutic mAbs with various indications have been approved by US Food and Drug Administration (FDA) and hundreds of monoclonal antibodies are currently in various stage of development.4

Antibodies have been studied for over a century, so the basic structures and functions of antibodies are now well understood.5-7 Briefly, the building units of antibodies consist of two identical light and two identical heavy polypeptide chains held together by disulfide bridges. Light chains have two isotypes, namely λ and κ, differing in sequence composition. Heavy chains have five isotypes based on chain structure and effector function. All currently approved therapeutic mAbs belong to the IgG class. IgG has the simplest form and is the major immunoglobulin type in human sera. IgGs are further divided into four subclasses, IgG1, IgG2, IgG3 and IgG4. The sequences of light chains and different IgG heavy chains are variable in the N-terminal variable domain and conserved in the remaining constant domain. The variable domains, especially the complementarity determining regions (CDRs), determine the antigen binding specificity. Each IgG subclass has characteristic disulfide bond pattern, differing mostly in the hinge region. The hinge is the most flexible region in a mAb. It connects the two Fab and the Fc domains into ‘Y’ or ‘T’ like overall structures as revealed from the crystal structures for intact murine antibodies.8,9

Monoclonal antibodies, like other protein therapeutics, are susceptible to degradation at all stages from production to dose

*Correspondence to: Sandeep Kumar; Pfizer Global Research and Development; Pharmaceutical R & D; AA324; 700 Chesterfield Parkway West; Chesterfield, MO 63017 USA; Tel.: 636.247.0176; Fax: 636.247.0247; Email: [email protected]

Submitted: 12/10/08; Accepted: 01/29/09

Previously published online as a mAbs E-publication: http://www.landesbioscience.com/journals/mabs/article/8035

Report

Potential aggregation prone regions in biotherapeuticsA survey of commercial monoclonal antibodies

Xiaoling Wang, Tapan K. Das, Satish K. Singh and Sandeep Kumar*

Pharmaceutical R & D; Global Biologics; Pfizer Global Research & Development; Chesterfield, MO USA

Abbreviations: mAb, monoclonal antibody; FDA, food and drug administration; CDR, complementarity determining region; Fab, frag-ment, antigen binding; Fc, fragment, crystallize easily; Fv, fragment, variable; VL, variable domain of light chain; VH, variable domain of heavy chain; scFv, single chain Fv; CL, constant domain of light chain; CH, constant domain of heavy chain; PDB, protein data bank; Q/N, glutamine and/or asparagine; QbD, quality by design

Key words: monoclonal antibody, aggregation, antibody sequence, aggregation-prone region, aggregation prediction

Page 2: Potential aggregation prone regions in biotherapeutics

www.landesbioscience.com mAbs 255

administration. Several chemical and physical pathways, such as deamidation, oxidation, hydrolysis/fragmentation, isomeriza-tion and aggregation, are responsible for the degradation of the proteins.10 Among these, aggregation is the least understood degra-dation route for mAbs.

Aggregation is widely seen in proteins and refers to self- association of a number of protein molecules that may form visible or invisible particles, precipitates or fibrils.11,12 Self-association may or may not be preceded by adoption of non-native conforma-tions by the protein or parts thereof. Aggregation is an important factor in the development of a biotherapeutic.13-15 The primary concern around aggregation is the potential to enhance immune response, which presents a risk-factor in the development of these therapeutics.16-18 If aggregation in the liquid state can not be controlled, the product may need to be lyophilized, increasing the cost of production. Aggregation problems are further exacerbated when the mAb needs to be formulated at high concentrations for subcutaneous route of administration.19,20 Hence, scientists engaged in product development and formulation of the biologic drug products must discover ways to counter these problems. In the absence of a molecular-level understanding of the aggregation mechanism, work in preventing aggregation has been mainly based on experimental heuristic screening, and dependent on particulars of each case.

Protein misfolding and aggregation is an intrinsic property that underlies a variety of neurodegenerative disorders such as Alzheimer and Parkinson disease.21-23 A hallmark of these diseases is aggregation by formation of fibrillar amyloid deposits. Through extensive research, sequence motifs that lead to this pathogenesis have been identified. Computational tools have been developed to analyze protein sequences and predict the propensity for such aggregate formation. Two such sequence-based prediction tools are TANGO24 and PAGE.25

We believe that the intrinsic (in)stabilities of mAbs, as governed by the sequence-structure-function paradigm, are also coded in their sequences. In this report, we have collected the sequences of commercial mAbs and performed sequence analysis in a system-atic manner. We used sequence-based prediction tools (TANGO and PAGE) to identify the potential aggregation-prone regions in commercial mAb sequences. Our analyses reveal several common features among the aggregation-prone motifs found in both vari-able and constant domains. The aggregation-prone motifs in variable domains are primarily located in the CDR loops, the regions that bind directly with the cognate receptor or antigen. Almost all aggregation-prone motifs are rich in hydrophobic or aromatic residues. Hydroxyl group containing residues, Ser and Thr, are also found in several aggregation-prone motifs. However, charged residues (Asp, Glu, Lys, Arg and His) are rare in such regions. The aggregation-prone motifs found in light chain CDR3 regions are glutamine (Q)/asparagine (N) rich. Sequence motifs rich in Q/N, hydrophobic or aromatic residues have been previ-ously implicated in aggregation of prion or amyloidal proteins.26-28 We also performed identical analyses on a set of non-commercial antibody sequences and obtained similar results. The importance of location of aggregation-prone regions on 3D structure is also

addressed. Mapping of the aggregation-prone motifs on the avail-able Fab—receptor/antigen complex structures revealed that, when present in mAb CDRs, these motifs might also contribute towards receptor/antigen binding. This poses an interesting challenge to antibody engineers: how to minimize the vulnerability of the mole-cule towards physical (or chemical) degradation while still retaining high potency and specificity towards the cognate targets?

We must also point out that our analysis delves into only one part of the complicated phenomena of aggregation. Apart from the intrinsic factors, there are a whole host of extrinsic factors that together ultimately determine the levels of aggregation seen in a product. Furthermore, judicious process and formulation devel-opment diminishes the impact of intrinsic factors and attenuates the impact of extrinsic factors, as evidenced by the number of successful commercial mAb products that have had a significant impact on human health over the last two decades.

Therapeutic Monoclonal Antibodies in the Market

Since the launch of muromonab-CD3 (Orthoclone OKT3) in 1986, therapeutic monoclonal antibodies have evolved from the early murine origin, through chimeric and humanized, to fully human form in little over 20 years. All 22 FDA approved commer-cial therapeutic monoclonal antibodies, as well as two polyclonal antibody fragments and a diagnostic mAb approved by FDA, are listed in Table 1. Indications, formulations and marketing for these antibodies are summarized in refs. 29–31. The increase in the number of therapeutic antibody drug products over time mirrors the advances in antibody production techniques.32-34 The first therapeutic candidates were murine antibodies produced by hybridoma technique,1 but use of these mAbs was limited due to their immunogenic complications in human patients. Patients produced human anti-mouse antibody (HAMA) response as the murine antibody was 100% foreign protein.35 Due to these concerns, production of murine mAbs acrituomab (CEA-scan, www.immunomedics.com), fanolesomab (Neutrospec, www.palatin.com), imciromab pentetate (Myoscint, www.fda.gov/cder), satumomab pendetide (Oncoscint, http://nuclearpharmacy.uams.edu/resources/PackageInserts.asp), and nofetumomab (Verluma, http://nuclearpharmacy.uams.edu/resources/PackageInserts.asp), was discontinued.

Subsequent generations of therapeutic antibodies have been re-engineered to reduce mouse origin content in order to reduce immunogenicity. In the mid-1980s, simple chimeric mouse-human antibodies42-44 were made using recombinant DNA techniques.2 Chimeric antibodies consist of variable domain of mouse antibody and constant domains of human antibody. The mouse content is reduced to approximately 33% in chimeric mAbs. Two chimeric mAbs, rituximab (Rituxan) and infliximab (Remicade) are among the five top selling antibodies today. Antibodies were further humanized by grafting only murine CDR regions into the human antibody framework.45,46 Humanized anti-bodies have 5–10% of mouse content. Following the approval of the first humanized therapeutic antibody (daclizumab [Zenapax]) in 1997, humanized antibodies trastuzumab (Herceptin) and bevacizumab (Avastin) have gone on to achieve blockbuster status.

Aggregation prone regions in commercial mAbs

Page 3: Potential aggregation prone regions in biotherapeutics

Aggregation prone regions in commercial mAbs

256 mAbs 2009; Vol. 1 Issue 3

Most commercial antibodies are intact conventional mAbs. However, for enhanced affinity, reduced toxicity and improved tissue permeation, antibody fragments such as Fab, Fv and single chain Fv (scFv) can be used. The Fab products are currently most common. Marketed Fabs (Table 1) are ranibizumab (Lucentis),

The current mAb market is dominated by chimeric and human-ized antibodies. However, fully human antibodies generated using phage display technique3,47,48 and transgenic animals49-51 are reaching the market. Adalimumab (Humira; launched in 2002) is one such example.

Table 1 List of commercial mAbs

# Brand name Molecule name Isotype and format Source Source of sequence Approval year Company1 Avastin Bevacizumab IgG1 κ Humanized CAS ID: 216974-75-3 2004 Genentech and BioOncology2 Bexxar Tositumomab and IgG2a λ Murine 2003 Corixa and GSK I-131Tositumab Conjugated to Iodine3 Campath Alemtuzumab IgG1 κ Humanized DB ID: DB00087 2001 Ilex Oncology; Millenium and Berlex4 Cimzia Certolizumab IgG1 K; Fab; Humanized CAS ID: 428863-50-7 2008 Nektar/UCB pegol Conjugated to Polyethylene glycol5 CroFab Crotailidae Fab Ovine 2002 Protherics Inc Polyvalent Immune Fab6 DigiFab Digoxin Immune IgG2a κ; Fab Ovine DB ID: DB00076 2001 Protherics Inc Fab PDB ID: 1IGJ7 Erbitux Cetuximab IgG1 κ Chimeric DB ID: DB00002 2004 ImClone and BMS8 Herceptin Trastuzumab IgG1 κ Humanized DB ID: DB00072 1998 Genentech9 Humira Adalimumab IgG1 κ Human CAS ID: 331731-18-1 2002 CAT and Abbott10 Lucentis Ranibizumab IgG1 κ; Fab Humanized CAS ID: 347396-82-1 2006 Genentech11 Mylotarg Gemtuzumab IgG4 κ Humanized ref. 36, Fv only 2000 Celltech and Wyeth ozogamicin Conjugated to Chimeric Calicheamicin12 Orthoclone Muromonab-CD3 IgG2a κ Murine DB ID: DB00075 1986 Ortho Biotech OKT13 ProstaScint Indium-111 IgG1 κ Murine 1996 Cytogen Capromab Conjugated pendetide to Indium14 Raptiva Efalizumab IgG1 κ Humanized CAS ID: 214745-43-4 2003 Genentech ref. 37 and Xoma15 Remicade Infliximab IgG1 κ Chimeric ref. 38 1998 Centocor16 ReoPro Abciximab 1gG1; Fab Chimeric 1994 Centocor/Lilly17 Rituxan Rituximab IgG1 κ Chimeric DB ID: DB00073 1997 Genentech and IDEC18 Simulect Basiliximab IgG1 κ Chimeric DB ID: DB00074 1998 Novartis19 Soliris Eculizumab IgG2/4 κ Humanized CAS ID: 219685-50-5 2007 Alexion ref. 39, 40 Pharmaceuticals20 Synagis Palivizumab IgG1 κ Humanized PDB ID: 2HWZ ref. 41 1998 MedImmune21 Tysabri Natalizumab IgG4 κ Humanized CAS ID: 189261-10-7 2004 Elan and Biogen Idec22 Vectibix Panitumumab IgG2 κ Human CAS ID: 339177-26-3 2006 Amgen23 Xolair Omalizumab IgG1 κ Humanized CAS ID: 242138-07-4 2003 Genentech w Novartis24 Zenapax Daclizumab IgG1 κ Humanized CAS ID: 152923-56-3 1997 Roche25 Zevalin Ibritumomab IgG1 κ Murine DB ID: DB00078 2002 IDEC Conjugated to Indium or Yttrium

Page 4: Potential aggregation prone regions in biotherapeutics

www.landesbioscience.com mAbs 257

Aggregation prone regions in commercial mAbs

(Simulect) and rituximab (Rituxan). Their aligned scores are close to 90%. The murine mAb muromonab-CD3 (Orthoclone OKT) and the ovine mAb digoxin immune Fab (DigiFab) share lowest sequence similarities with other mAbs, as indicated by the low aligned scores of ~60%.

Figure 2A shows the heavy chain sequence alignment for commercial mAbs. Similar to light chains, sequences in constant regions for mAbs of the same isotype are very similar, if not iden-tical. Variability comes from VH, especially CDRs. In the IgG1 group, overall sequence identities of human and humanized mAbs (adalimumab [Humira], bevacizumab [Avastin], alemtuzumab [Campath], trastuzumab [Herceptin], efalizumab [Raptiva], omal-izumab [Xolair], daclizumab [Zenapax], certolizumab [Cimzia Fab] and ranibizumab [Lucentis Fab]) are greater than 85%, while the VH sequence aligned scores are ~67%. The same applies to the chimeric IgG1 mAbs cetuximab (Erbitux), basiliximab (Simulect) and rituximab (Rituxan). These mAbs share 85% overall sequence identities, but 54% identities for VH sequence only. The human-ized mAb palivizumab (Synagis) appears to be an exception as its aligned scores with other human or humanized IgG1 mAbs are below 70% when only Fab portions of the sequences are aligned (note that only Fab sequence is available for palivizumab). Ibritumomab (Zevalin) is the only IgG1 murine mAb in Figure 2A. Its sequence similarities with others are lower than 60%.

The hinge region is important for flexibility and effector func-tions of the IgG molecules.55 Each IgG subclass has characteristic sequence composition and disulfide bonding pattern in the hinge region.62,63 All the IgG1 commercial mAbs, except ibritumomab (Zevalin), have the characteristic IgG1 hinge with three cysteine residues (Fig. 2A). The first cysteine is disulfide bonded with the C-terminal cysteine in light chain. The other two cysteines form inter-heavy chain disulfide bridges. The constant region of eculi-zumab (Soliris) is the hybrid of IgG2 (CH1 domain and hinge region) and IgG4 (CH2 and CH3 domain).64 Both eculizumab (Soliris) and panitumumab (Vectibix) have the typical IgG2 hinge. However, the hinge sequence for panitumumab as published in CAS registry (CAS ID: 339177-26-3) does not show an IgG2 characteristic pattern. In Figure 2A, the sequence of panitumumab (Vectibix) is shown after replacement with a canonical IgG2 hinge. Muromonab-CD3 (Orthoclone OKT) is another outlier as an IgG2 mAb with an IgG1 hinge. Natalizumab (Tysabri) has the IgG4 hinge in which the four cysteines form inter-heavy chain disulfide bonds.

Sequence alignments for non-commercial antibodies. To compare our data set of commercial mAbs, we have randomly collected 20 non-commercial antibody sequences. The crystal structures for all of these are available in the protein data bank (PDB).54 Three of the 20 are the sequences for intact mAbs (PDB entries: 1IGT, 1IGY, 1HZH), seven are antigen-antibody (Fab) complexes and the remaining ten are Fabs only. We have done the same analyses on these 20 antibody sequences. Their light and heavy chain sequence alignments are presented as Figures 1B and 2B to facilitate the direct comparison with the commercial mAbs.

abciximab (ReoPro), certolizumab (Cimzia), digoxin immune Fab (DigiFab) and crotailidae polyvalent immune Fab (CroFab). The latest technologies for the design of high affinity antibodies is reviewed by Wark et al.52

Sequence Differences and Similarities

Sequence availability for commercial mAbs. We have collected the sequences of commercial mAbs in as complete a manner as feasible to perform sequence analysis. The sources of sequences are Drug Bank database (www.drugbank.ca)53 and CAS registry (www.cas.org). Some partial sequences are collected from RSCB Protein Data Bank (www.rcsb.org/pdb/home/)54 and literature. The Drug Bank ID, CAS ID, PDB ID, and references are listed in Table 1.

The sequences of commercial mAbs are not all publically available. For example, we could not find the sequences of tosi-tumomab (Bexxar), abciximab (ReoPro), capromab pendetide (ProstaScint) and crotailidae polyvalent immune Fab (CroFab) in the public domain. In a few other cases, only partial sequences are publicly available. The sequences of only the Fab and Fv regions are available for palivizumab (Synagis) and gemtuzumab ozogamicin (Mylotarg), respectively. Fv sequence of infliximab (Remicade) has been published,38 but the CDR regions are missing. Thus, infliximab (Remicade) was not included in subsequent sequence analysis.

Sequence isotypes of the commercial mAbs. Table 1 summa-rizes the brand name, generic name, isotype, source, year of first FDA approval, and the manufacturer for each antibody included in this study. The heavy chains are dominantly IgG1 with only a few of the IgG2 or IgG4 type. The choice of IgG isotype commonly depends on the requirement for complement binding and effector function.55,56 IgG1 has a flexible hinge and good Fc-mediated effector activity, while IgG2 has limited effector activity. IgG2 is commonly chosen when only the binding specificity is desired.56,57 However, mAbs of IgG2 type have been reported to show disulfide scrambling.58-60 Disulfide scrambling affects the hinge structure in IgG2 mAbs and likely leads to product heterogeneity.

Most light chains are of κ type. Tositumomab (Bexxar) is the only product with λ light chain. The light chain type for abcix-imab (ReoPro) and crotailidae polyvalent immune Fab (CroFab) is unknown from public sources. κ light chains have a high abun-dance in mice (95%) but much lower (60%) in humans.61

Sequence alignments for commercial mAbs. The multiple sequence alignment of light chains for commercial mAbs is presented in Figure 1A. All the light chains shown in Figure 1A are κ chains. The degree of sequence identity is consistent with the sequence origin, i.e., source organism. The sequences of human and humanized mAbs have very high identities. The pair-wise aligned scores are as high as 90%. As expected, the differences primarily lie in CDR regions. The aligned scores are reduced to 80% if only VL sequences are considered. The aligned scores between the human or humanized and the chimeric mAbs are 75–80%. Even though ibritumomab (Zevalin) is a murine mAb, its light chain is very similar to the two chimeric mAbs, basiliximab

Page 5: Potential aggregation prone regions in biotherapeutics

Aggregation prone regions in commercial mAbs

258 mAbs 2009; Vol. 1 Issue 3

Figure 1. For figure legend, see page 259.

Page 6: Potential aggregation prone regions in biotherapeutics

www.landesbioscience.com mAbs 259

Aggregation prone regions in commercial mAbs

Figure 1. (A) Sequence alignment of light chains of commercial mAbs. The letters at the end of the mAb brand name are used to indicate its source. ‘u’ for human, ‘zu’ for humanized, ‘xi’ for chimeric, ‘o’ for mouse and ‘ov’ for sheep mAb. For molecular name of each mAb, refer to Table 1. All cysteine amino acids are highlighted in green. The three CDRs are highlighted in yellow. The predicted aggregation-prone segments are in red letters. (B) Sequence alignment of light chains of the twenty randomly selected non-commercial mAbs. Highlighting has the same meaning as in (A).

Figure 2A. For figure legend, see page 260.

Page 7: Potential aggregation prone regions in biotherapeutics

Aggregation prone regions in commercial mAbs

260 mAbs 2009; Vol. 1 Issue 3

proteins, especially ones with amyloidogenic properties, often contain specific sequence or structural motifs that tend to promote aggregation.26,65 Knowledge of the aggregation-prone regions in proteins may help alleviate aggregation concerns in biotherapeutics via rational design and/or selection of robust yet potent candidates, thus, bringing Quality by Design (QbD) approaches to the early stages of drug discovery and development.

Several molecular seeding conformations may give rise to aggre-gates. One of the best-studied conformational seed is the cross-β motif seen in amyloid-like fibrils.66,67 While the presence of β- strands is not essential to form this motif, several β-rich proteins (e.g., β2-microglobulin, transthyretin, immunoglobulin light chain especially of type λ, immunoglobulin heavy chain, concavalin A) form amyloids using the cross-β motif as seed.68-73 In addition, Siepen et al. have shown that the exposed edge β-strands in

Prediction of potential aggregation-prone regions

Commercial mAbs are subjected to numerous stress factors, such as pH changes, temperature excursions, chemical modification, shear and freeze/thaw, at various stages of production, shipping and storage. These stress factors may trigger self-association of the mAb molecules leading to aggregation. Currently, the process by which protein molecules self associate to seed aggre-gation and the total number of variables involved are not well-understood. Does molecular association require a confor-mational change from the native state? If so, what degree of conformational change is required? Which regions of the molecule are more conformationally labile than others? Are these confor-mationally labile regions more likely to drive self-association and promote aggregation? Accumulating experimental data shows that

Figure 2A. Sequence alignment of heavy chains of commercial mAbs. ‘G1’ or ‘G2’ after the mAb brand name are for IgG1 and IgG2, respectively. For molecular name of each mAb, refer to Table 1. Highlighting scheme is same as Figure 1. In addition, the generic hinge region is highlighted in blue. Cysteine amino acids in hinge are highlighted in red. Note the hinge region of Vectibix is shown with a canonical IgG2 hinge. The mAbs except Remicade are grouped according to isotype and ordered based on degree of humanization within each group.

Figure 2B. Sequence alignment of light chains of the twenty randomly selected non-commercial mAbs. Highlighting has the same meaning as in (A).

Page 8: Potential aggregation prone regions in biotherapeutics

www.landesbioscience.com mAbs 261

Aggregation prone regions in commercial mAbs

peptides from 16 proteins including α-synuclein, apolioprotein, APP, IAPP, prion, Sup35 and tau.25

We deliberately chose to use at least two different aggregation-prone region prediction tools. This approach has helped us avoid potential biases due to the training sets, parameterization and peculiarities of methods in a given tool. Even though the authors of TANGO suggest a region is potentially aggregation-prone if the TANGO scores of more than five consecutive residues are above 5%,24 we use a much more stringent criteria of ≥50%. The aggregation propensity (lnp) from PAGE is converted to Z score in order to identify the regions with statistically high aggregation propensity. The Z score of residue i is calculated as follows,

(1)

where, is the average aggregation propensity of the sequence, std(lnπ) is the standard deviation. A region with Z score >1.96 is considered as aggregation-prone. We identify a region as aggrega-tion prone if it is strongly predicted by at least one program. Use of stringent criteria ensures that our predicted regions have a greater probability of being truly aggregation-prone. However, we must note that our choice is rather arbitrary, and additional aggregation-prone motifs could have been identified with less stringent criteria. The real test lies in experimental confirmation. Experiments on polypeptides containing the predicted aggregation-prone regions are planned.

We have performed TANGO and PAGE analysis on all the collected commercial mAbs and the 20 non-commercial anti-body sequences. The TANGO and PAGE profiles of trastuzumab

proteins can initiate amyloidogenesis by forming the cross β-motifs.74 For example, the recognition between exposed edge β-strands is an important step in amyloid fibrilogen-esis for transthyretin and β2-microglobulin.74 In another example, Pande et al. has proposed that aggregation by P23T mutant of human γD-Crystallin may be via this mechansim.75 Moreover, work from Dobson’s group has shown that fibrilogenesis is a general feature of proteins under stress or denaturating conditions.76 Monoclonal antibodies are also β-sheet rich proteins. Hence, we were curious to explore if mAbs can also potentially aggregate via the cross β-motif conformational seed. If so, the sequence motifs prone to this type of aggregation should be found in both the commercial and other mAbs. The paper by Maas et al.17 is particularly relevant in this regard. Using the amyloid-marker dyes Congo Red and Thioflavin T, they detected formation of cross-β struc-tures and amyloid-like fibrils in many biopharmaceuticals, including two antibodies which nevertheless had among the lowest signal levels. Besides this, we also have evidence for Thioflavin T dye binding in aggregated mAb samples from our own work in progress. This work is our first attempt at identifying aggregation-prone motifs in thera-peutic mAb sequences by taking advantage of the tools developed for amyloidogenic proteins. The presence of just one aggregation-prone motif could drive cross-β motif type aggre-gation. However, the occurrence of aggregation-prone sequences in mAbs does not guarantee that aggregation shall necessarily proceed by this mechanism. This may or may not be the significant mode of aggregation in therapeutic mAbs.

Prediction methods. A number of computational approaches, such as TANGO,24 PAGE25 and Zyggregator,77 have been devel-oped to predict the aggregation-prone regions in amyloidogenic proteins. We have chosen two prediction tools, viz. TANGO and PAGE, to identify potential aggregation-prone regions in commer-cial mAb sequences. TANGO and PAGE have been parameterized based on the experimental data on amyloidogenic peptides and proteins. However, the two tools use different algorithms.

TANGO and PAGE, developed by Serrano’s and Caflisch’ group respectively, use the primary amino acid sequence only and make predictions based on the physicochemical properties of amino acids. The TANGO algorithm relies on physicochemical rules behind β-sheet formation and assumes that the core of an aggregate is completely desolvated. TANGO calculates relative probabilities of peptide segments to exist in different conformational states such as a-helix, b-strand, turn, random coil and b-aggregate.24 The TANGO algorithm was tested against 179 peptides derived from 21 different proteins as published.24 TANGO yielded a success rate of 92% for aggregation tendency of 5% or greater.

PAGE computes the aggregation propensity (ln�) and absolute aggregation rate (lnR) for a given amino acid sequence by sliding a small window of 5–9 residues along the sequence. The aggregation propensity is calculated based on aromaticity, b-strand propensity, charge, solubility, average polar/non-polar/accessible surface area of each residue in the given widow.25 PAGE was tested against

Figure 3. The TANGO and PAGE profiles for trastuzumab (Herceptin) mAb. (A) Light chain profile. X-axis shows sequence number. Left y-axis and blue curve are for PAGE Z score. Right y-axis and black curve are for TANGO aggregation per-centage. The red horizontal line indicates Z score = 1.96. (B) Heavy chain profile. Axes and line styles have same meaning as in (A).

Page 9: Potential aggregation prone regions in biotherapeutics

Aggregation prone regions in commercial mAbs

262 mAbs 2009; Vol. 1 Issue 3

contains 2–8 aggregation-prone motifs per light and heavy chain pair. Moreover, the locations of all the aggregation-prone motifs are consistent across the alignment (Figs. 1A, 2A and Table 2). The aggregation-prone motifs in the constant regions are nearly identical due to greater conservation in these regions (Table 2). The aggregation-prone regions in variable domains are primarily located in CDRs and in the adjoining framework β-strands. The individual motifs in the variable domains show greater sequence variation (Table 2). At this time, however, it is unknown if the number of aggregation-prone motifs in a mAb is directly correlated with the rate or degree of its aggregation. It is also not clear if any of the above mentioned motifs is a stronger aggregation driver than the others.

The degree of humanization in these commercial mAbs may not be related to fewer aggregation prone motifs. Apart from alem-tuzumab (Campath) and daclizumab (Zenapax), the heavy chains of human or humanized IgG1 mAbs do not contain aggregation-prone motifs in their variable regions (Fig. 2A). The constant regions of heavy chains as well as the variable and constant regions in light chains (Fig. 1A) do not show fewer aggregation-prone motifs in fully human or humanized IgG1 mAbs. This is not

(Herceptin) are shown in Figure 3 as an example. The aggregation-prone regions obtained from this analysis are highlighted in the multiple sequence alignments shown in Figures 1 and 2. The location and composition of aggregation-prone regions for these non-commercial antibodies are similar to those from commercial mAbs. Table 2 provides a list of aggregation-prone motifs found in commercial mAbs sorted according to their location.

To further validate our approach, we have also performed TANGO and PAGE analysis on the biopharmaceutical proteins used in the work of Maas et al.17 The criteria used here were the same as those for the commercial mAbs and non-commercial antibodies. Out of the 11 biopharmaceuticals studied by Maas et al. (Table 1 in ref. 17), the amino acid sequences for nine are available in the public domain (www.drugbank.ca or CAS registry). Consistent with the experimental results,17 TANGO and PAGE analyses have identified aggregation-prone motifs in all biopharmaceuticals (Table 3). Furthermore, the aggregation-prone motif 14-ALYLV-18 (Table 3) coincides with the experimentally proven fibril-forming segment (12-VEALYL-17) of insulin.78

Figures 1A and 2A indicate that aggregation-prone motifs are found in all domains of the antibodies except CH1. Each antibody

Table 2 Summary of aggregation-prone motifs in commercial mAbs

# General location and Aggregation- Occurrence # General location and Aggregation- Occurrence sequence positiona prone motifs in mAbsb sequence positiona prone motifs in mAbs1 Precede and partly LLIYSASFLY 7 9 Right after CDR H2 VTMLV 1 in CDR L2 LLIYAASYL- Residue 72–76 Residue 51–60 LLIYGA---- LLIYAA---- VLIYF-----2 Right before CDR L3 LGIYF 1 10 Precede CDR H3 IYYCV 1 Residue 88–92 Residue 97–1013 Precede and mostly YCQQNNN 8 11 In CDR H3 VVYYSNSYWYF--- 5 in CDR L3 YCQQHNE Residue 104–118 - Residue 92–98 YCQQYS- IFYFYGTTY-F--- YCQQYN- - YCQQHY- -----DDHYC---- YCLQYD- - YCQQS-- --------- FAVWG4 Beginning of SVFIFP 17 12 In CH2 SVFLFPP 15 CL domain SVFIF- Residue 254–259 SVFLFP- Residue 119–124 TVFIFP SVFIFP-5 In CL domain VVCLL 19 13 In CH2 VLMISL 1 Residue 137–141 VVCFL Residue 265–2706 In CDR H1 YIFSNYWIQWV 3 14 In CH2 IVTCVVV 1 Residue 27–39 -IFTDF----- Residue 273–279 -----YYWTWI7 Precede and partly IGAIY 6 15 End of CH2 VVSVLTVL 14 in CDR H2 LGVIW Residue 317–324 VVSVLTVV Residue 50–54 IGYIS IGYIY8 In CDR H2 TTEYN- 3 16 In CH3 GSFFLYS 14 Residue 61–66 -TEYNQ Residue 417–423 GSFFLY- -TNYNQ -SFFLYS -SFFLY-

aContinuous numbering in the aligned sequences as shown in Figures 1 and 2. is used for residue position. bThis column records the occurrences of the corresponding motifs in the mAbs in Figures 1 and 2.

Page 10: Potential aggregation prone regions in biotherapeutics

www.landesbioscience.com mAbs 263

Aggregation prone regions in commercial mAbs

tracts is associated with at least nine different neurodegenerative diseases, including Huntington’s disease.84-86 This high aggregation propensity of Q/N rich sequences is explained by the ‘polar-zipper’ model proposed by Perutz and coworkers based on their structural studies.87 A number of computational studies have also provided insights on the aggregation mechanism of Q/N rich sequences.88-90 The crystal structures of two short peptides NNQQNY and GNNQQNY have been solved.66 These structures elucidated a ‘steric zipper’ motif implicated in fibril formation. This work led to development of a computational method to detect similar short peptide sequences.78 This work and other computational model of amyloid structures were also analyzed by Zhang et al.67 They have outlined three consensus features of amyloid organization, namely, (1) polar or non-polar VDW zippers; (2) in-register β-strands; (3) twisted β-sheets.

Interestingly, several motifs predicted to be aggregation-prone are conserved in the constant domains of almost all mAbs. SVFIFP in CL, SVFLFP in CH2 and GSFFLYS in CH3 are phenylalanine rich segments (motif 4, 12 and 16 in Table 2). Phenylalanine, an aromatic amino acid, is found to play an important role in directing molecular recognition and self-assembly.28 Many experi-mentally studied β-aggregating segments contain phenylalanine residues. Some examples are LVFFA in amyloid precursor protein (APP) and RGFFY in islet amyloid polypeptide (IAPP).25 Tyrosine and tryptophan also have effects similar to phenylalanine. Jones et al. have found that the aromatic residue rich peptides from β2-Microglobulin readily form fibrils in vitro.68 The motifs 6 (YIFSNYWIQWV and YYWTWI in Heavy Chain CDR 1) and 11 (VVYYSNSYWYF and IFYFYGTTYF in Heavy chain CDR 3) are rich in aromatic residues (Table 2). The role of aromatic residues in promoting aggregation is through the favorable π-stacking interactions; Phe especially is found to form significant homostacking.88 These interactions lead to aggregates with the specific ordered pattern seen in fibrils.25,68

surprising because the goal of humanization is to reduce immuno-genicity due to foreign origin.

In the antibody structure, CDRs form binding sites and are exposed on the surface of the Fab globules to facilitate antigen/receptor/hapten recognition. In absence of the cognate partners, the CDRs may promote self-assembly due to their conformational flexibility and solvent accessibility. However, it is not yet clear to us if the mAb self-association involves Fab: Fab or Fab: Fc association. In theory, both of these association modes are possible.

In general, the aggregation-prone motifs avoid charged residues (Asp, Glu, Lys, Arg and His). This feature is consistent with the fact that charged residues contribute favorably towards protein solubility.79 The net charge of an unfolded polypeptide chain has been reported to be inversely correlated with its aggregation propensity.80 Surprisingly, the hydrophilic hydroxyl-containing Ser/Thr residues are found in several motifs (Table 2), probably due to their high propensity to form β-strand and low propensity to form α-helix.81 More interestingly, almost all aggregation-prone motifs are rich in aromatic and hydrophobic β branched residues. The aggregation-prone motifs found in several light chain CDR3 regions are rich in Gln and Asn residues. Those motifs are worth further discussion as they have also been implicated in driving aggregation of proteins involved in neurodegenerative diseases.

The aggregation-prone motifs in the third CDR of light chains (YCQQNNN in cetuximab [Erbitux], YCQQHNE in efalizumab [Raptiva]; [motif 3 in Table 2]) are rich in Gln and Asn. These motifs are similar to the aggregation-seeding motifs YQQYN, YQNYQ and YQQQY found in the well-known yeast prion proteins sup35,27 and Ure2p,82 where the Q/N rich segments seed self-association. High Q/N content domains have strong propen-sity to form amyloid fibrils.83 The high aggregation propensity of Q/N rich sequences is also manifested in the mutational expan-sion of CAG codons which leads to the elongation of the encoded polyglutamine sequence. Aggregation of expanded polyglutamine

Table 3 Summary of aggregation-prone motifs in biopharmaceuticals

Therapeutic proteina Source of sequenceb Aggregation-prone motifc

Albumin DB00096 21 ALVLIAFA 153 ELLFFAK 325 VFLGMFLY 341 YSVVLLL 551 FAAFVSomatropin DB00052 78 ISLLLIQ 101 LVYGA 161 GLLYCInsulin aspart CAS 116094-23-6 14 ALYLVFactor VIII DB00025 50 LFVEF 58 LFNIA 84 VVITL 136 YVWQVL 172 LIGALLV 195 FILFAVF 269 IFLFG 303 LGQFLLFC 460 TLLIIFK 572 NVILFSVF 634 VAYWYILFIG 648 FLSVFFSG 790 LLMLL 1226 NLFLLS 1445 LAILT 1698 YFIAAV 1756 LGLLG 1771 IMVTF 1875 FALFFTIF 2012 TLFLVY 2096 YISQFIIMY 2123 LMVFFGN 2271 VTLFFEpoetin Alfa DB00016 65 GLALL 78 ALLVNEtanercept DB00005 259 SVFLFP 322 VVSVLTVL 422 GSFFLGlucagon DB00040 22 FVQWLM

aList of proteins is from Table 1 of ref. 17. bSequences are obtained from either Drugbank (www.drugbank.ca) or CAS registry. Listed are entry ID. cNumber before each motif is the position of the first residue of the motif.

Page 11: Potential aggregation prone regions in biotherapeutics

Aggregation prone regions in commercial mAbs

264 mAbs 2009; Vol. 1 Issue 3

3D Structures of Commercialized mAbs

IgG mAbs have very similar structure, which is in overall ‘Y’ or ‘T’ shape with Fab and Fc domains adopting the immunoglobulin fold. Currently, it is difficult to obtain crystal structures for intact antibodies. So far only three crystal structures of intact antibodies have been solved with moderate resolution.8,9,95 However, the Fab or Fc domains of antibodies are relatively easy to crystallize and many structures have been deposited in the Protein Data Bank. The Fab structures of some commercial mAbs are available, namely, bevacizumab (Avastin; PDB ID: 1BJ1),96 alemtuzumab (Campath; 1CE1),97 cetuximab (Erbitux; 1YY8 and 1YY9),98 tras-tuzumab (Herceptin; 1N8Z),99 muromonab-CD3 (Orthoclone OKT; 1SY6),100 basiliximab (Simulect; 1MIM),101 palivizumab (Synagis; 2HWZ) and digoxin immune Fab (DigiFab; 1IGJ). The Fc structure of rituximab (Rituxan; 1L6X)102 is available as well. Due to their high sequence identities, the Fab structures are strik-ingly similar. These Fab structures are aligned and shown in Figure 4. The root mean square distance (RMSD) values between these structures are less than 5 Å.

We have mapped the aggregation-prone regions predicted from amino acid sequence on to the 3D structure of Fabs. The aggregation-prone motifs in the CDRs are co-localized in the structures to yield characteristic structural motifs in the antigen binding site. These structural motifs may be involved in recogni-tion of the cognate receptors or antigens. As an example, Figure 5 shows the predicted aggregation-prone regions mapped on the Fab structure of trastuzumab (Herceptin) in complex with extracellular domain of human epidermal growth factor 2 (HER2).99 HER2 plays an important role in breast cancer. Trastuzumab (Herceptin) is designed to selectively bind with high affinity to the extracel-lular region of HER2. Two of the four aggregation-prone segments

However, in their study of amyloid aggregation of several mutants of human muscle acylphosphatase, Bemporad et al. have suggested that the high aggregation propensity of aromatic residues is attrib-uted to their hydrophobicity and intrinsic β-sheet forming propensity rather than aromaticity.91

Besides being the driving force for protein folding, the hydrophobic effect also plays an important role in protein aggregation.92,93 Most of the aggrega-tion-prone motifs in commercial mAbs (Table 2) are rich in hydrophobic residues, especially β-branched aliphatic residues. Ile, Val and Leu are among the top aggregation-prone residues reported by de Groot et al.26 Commercial mAbs contain VVCLL in CL and VVSVLTVL in CH2 (motif 5 and 15 in Table 2). These are two representative motifs with high hydropho-bicity. Stretches of hydrophobic residues, such as IAALL in Transthyretin have been shown to be aggregation promoting.71 The high aggregation propensity of hydrophobic stretches could come from two potential sources, i.e., the specific interactions mediated by the side chains of hydrophobic residues and the non-specific hydrophobicity per se.92 Kim et al. have tried to distinguish these two potential sources through systematic mutation in a study of Aβ42 peptide.92 Their findings suggest that the non-specific hydrophobicity is more important for aggregation than the specific side chain interactions.

Amyloid fibrils and β helices show similar conformations. Tsai et al.88 have performed a systematic analysis of β helices and revealed their sequence and structure characteristics. Some relevant characteristics are: (1) Charged residues except Asp disfavor the β sheet regions; (2) Asn and Phe show strong amide stacking inter-actions which stabilize β helices; (3) Ile and Val have the highest propensity to be in β strands. These features are also found in the aggregation-prone motifs identified in this study.

In order to promote aggregation, a region in a protein should satisfy the following criteria: (1) the amino acid residues in the region should have high intrinsic aggregation propensity; (2) the region should be conformationally unstable or flexible; and (3) the region should be exposed or become exposed upon conformational transition, to facilitate intermolecular interactions.77,94 In the case of commercial mAbs, the above mentioned motifs may well be buried in the native structure of mAb and, hence, blocked from aggregation. However, in case of conformational fluctuations, these motifs may become exposed. Therefore, the location of aggregation-prone segments on the 3-D structure and the confor-mational flexibility of mAb molecule are also important factors for predicting aggregation-prone regions with high accuracy. We are currently performing molecular dynamic simulation studies to further refine our predictions. The mapping of our sequence motifs on to the structures is given below.

Figure 4. Cartoon representation of the superimposed Fab structures of bevacizumab (Avastin), alemtuzumab (Campath), cetuximab (Erbitux), trastuzumab (Herceptin), basiliximab (Simulect) and palivizumab (Synagis). All light chains are in red and heavy chains in blue.

Page 12: Potential aggregation prone regions in biotherapeutics

www.landesbioscience.com mAbs 265

Aggregation prone regions in commercial mAbs

locations such as CDR loops; (2) it is possible to successfully use the methods developed for amyloidogenic peptide and proteins to predict aggregation in biotherapeutics. Sequences/motifs predicted to be involved in aggregation are seen to be important for function also. Our approach is validated by the finding that TANGO and PAGE also identified aggregation-prone motifs in non-mAb biop-harmaceuticals with available experimental aggregation data (Table 3).17 The tools and the analyses utilized in this work can thus be applied to mutational studies to optimize activity while reducing aggregation propensity at the discovery stage. The next step in prediction of aggregation-prone regions in biologic drugs shall be incorporation of the 3D structural information.

Acknowledgements

We thank Drs. James Carroll, Kevin King, Sandeep Nema, Sa V. Ho, Graeme Bainbridge and Bernard Violand for several helpful discussions. The authors are grateful to Pfizer IP committee members for review of this manuscript and for several useful comments. Drs. Arindam Bose and Thomas Warren are thanked for their interest and support for our work. A postdoctoral fellowship for Xiaoling Wang via predictive stability initiative at Pharmaceutical Research and Development, Global Biologics, in Pfizer is gratefully acknowledged. Two anonymous referees are acknowledged for their critical review of the manuscript.

on the Fab of trastuzumab (Herceptin) are located at CDR L2 and CDR L3, respectively. Up to 45% of the surface area of the two segments is buried after binding to HER2. This suggests that the two motifs are major participants in the binding sites. While we present trastu-zumab (Herceptin) HER2 extra cellular domain as an example, our observations for the other Fab structures, except palivi-zumab (Synagis), are similar (data not shown).

Our observations indicate that the sequence/structural motifs in mAb CDRs that bind the cognate receptors/antigens may also be the aggregation-prone regions. If this is true, then an interesting challenge is posed to the mAb engineers: how should potent and specific mAb therapeutics with low vulnerability towards aggregation be designed? This is a significant challenge because therapeutic mAb drug products are stored in absence of their cognate partners, leaving these motifs potentially vulnerable to aggregation. Structure based biologic drug design as part of QbD approaches can help address such issues.

The other two aggregation-prone segments in trastuzumab (Herceptin) are the two closing β-strands in CL domains of the Fabs (Fig. 5). One of the strands (SVFIFP) is at the edge of the 4-stranded β-sheet of CL which packs against the 4-stranded β-sheet of CH1 domain. The edge β-strands can readily form edge-to-edge aggregates.74,103 The amyloid fibril formation of a number of proteins, such as transthyretin (TTR), has been linked to the edge β-strand interactions.74

Summary and Conclusions

The technologies driving advances in antibody development have focused on increasing the efficacy and potency of mAbs while reducing immunogenic potential. Aggregation issues with candi-date mAbs are usually addressed at the formulation stage during drug development. However, consideration of aggregation issues at early stages of biologic drug discovery may save both time and resources while reducing risk at the later stages of drug formulation and clinical development. QbD approaches including structure based biologic drug design can be useful in addressing these chal-lenges.

The aggregation-prone regions in commercial and non-commer-cial antibodies, predicted by TANGO and PAGE, have sequence motifs reminiscent of those found in amyloidogenic proteins. As such, this is not surprising because TANGO and PAGE have been parameterized using data from amyloidogenic peptides and proteins. However, the novel findings of our study are as follows: (1) aggregation-prone motifs in commercial and non-commercial antibodies are consistent and present at functionally important

Figure 5. Structure of trastuzumab (Herceptin) Fab in complex with extracellular domain of HER2 is shown in cartoon representation. Green chain is HER2. Light and heavy chains of trastuzumab (Herceptin) are shown in gray and blue, respectively. The predicted aggregation-prone motifs are highlighted with different colors. The magenta segment is LLIYSASFLY in CDR L2 and the red is YCQQHY in CDR L3. The two motifs SVFIFP and VVCLL in LC are in cyan and green, respectively.

Page 13: Potential aggregation prone regions in biotherapeutics

Aggregation prone regions in commercial mAbs

266 mAbs 2009; Vol. 1 Issue 3

38. Nagahira K, Fukuda Y, Oyama Y, Kurihara T, Nasu T, Kawashima H, et al. Humanization of a mouse neutralizing monoclonal antibody against tumor necrosis factor-a (TNFa). J Immunol Methods 1999; 222:83-92.

39. Mueller JP, Giannoni MA, Hartman SL, Elliott EA, Squinto SP, Matis LA, Evans MJ. Humanized porcine VCAM-specific monoclonal antibodies with chimeric IgG2/G4 constant regions block human leukocyte binding to porcine endothelial cells. Mol Immunol 1997; 34:441-52.

40. Thomas TC, Rollins SA, Rother RP, Giannoni MA, Hartman SL, Elliott EA, et al. Inhibition of complement activity by humanized anti-C5 antibody and single-chain Fv. Mol Immunol 1996; 33:1389-401.

41. Johnson S, Oliver C, Prince GA, Hemming VG, Pfarr DS, Wang S-C, et al. Development of a Humanized Monoclonal Antibody (MEDI-493) with Potent In Vitro and In Vivo Activity against Respiratory Syncytial Virus. J Infect Dis 1997; 176:1215-24.

42. Morrison SL, Johnson MJ, Herzenberg LA, Oi VT. Chimeric human antibody mol-ecules: mouse antigen-binding domains with human constant region domains. Proc Natl Acad Sci USA 1984; 81:6851-5.

43. Boulianne GL, Hozumi N, Shulman MJ. Production of functional chimaeric mouse/human antibody. Nature 1984; 312:643-6.

44. Neuberger MS, Williams GT, Mitchell EB, Jouhal SS, Flanagan JG, Rabbitts TH. A hapten-specific chimaeric IgE antibody with human physiological effector function. Nature 1985; 314:268-70.

45. Jones PT, Dear PH, Foote J, Neuberger MS, Winter G. Replacing the complementar-ity-determining regions in a human antibody with those from a mouse. Nature 1986; 321:522-5.

46. Riechmann L, Clark M, Waldmann H, Winter G. Reshaping human antibodies for therapy. Nature 1988; 332:323-7.

47. Smith GP. Filamentous fusion phage: novel expression vectors that display cloned anti-gens on the virion surface. Science 1985; 228:1315-7.

48. Schmitz U, Versmold A, Kaufmann P, Frank HG. Phage Display: A Molecular Tool for the Generation of Antibodies—A Review. Placenta 2000; 21:106-12.

49. Lonberg N, Taylor LD, Harding FA, Trounstine M, Higgins KM, Schramm SR, et al. Antigen-specific human antibodies from mice comprising four distinct genetic modifica-tions. Nature 1994; 368:856-9.

50. Green LL, Hardy MC, Maynard-Currie CE, Tsuda H, Louie DM, Mendez MJ, et al. Antigen-specific human monoclonal antibodies from mice engineered with human Ig heavy and light chain YACs. Nat Genet 1994; 7:13-21.

51. Lonberg N. Human antibodies from transgenic animals. Nat Biotechnol 2005; 23:1117-25.

52. Wark KL, Hudson PJ. Latest technologies for the enhancement of antibody affinity. Adv Drug Deliv Rev 2006; 58:657-70.

53. Wishart DS, Knox C, Guo AC, Shrivastava S, Massanali M, Stothard P, et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 2006; 34:668-72.

54. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res 2000; 28:235-42.

55. Burton DR, Woof JM. Human Antibody Effector Function. Adv Immunol 1992; 51:1-84.

56. Jefferis R. Antibody therapeutics:isotype and glycoform selection. Expert Opin Biol Ther 2007; 7:1401-13.

57. Beenhouwer DO, Yoo EM, Lai C-W, Rocha MA, Morrison SL. Human Immunoglobulin G2 (IgG2) and IgG4, but Not IgG1 or IgG3, Protect Mice against Cryptococcus neofor-mans Infection. Infect Immun 2007; 75:1424-35.

58. Dillon TM, Ricci MS, Vezina C, Flynn GC, Liu YD, Rehder DS, et al. Structural and Functional Characterization of Disulfide Isoforms of the Human IgG2 Subclass. J Biol Chem 2008; 283:16206-15.

59. Martinez T, Guo A, Allen MJ, Han M, Pace D, Jones J, et al. Disulfide Connectivity of Human Immunoglobulin G2 Structural Isoforms. Biochemistry 2008; 47:7496-508.

60. Wypych J, Li M, Guo A, Zhang Z, Martinez T, Allen MJ, et al. Human IgG2 Antibodies Display Disulfide-mediated Structural Isoforms. J Biol Chem 2008; 283:16194-205.

61. Almagro JC, Hernández I, Ramírez MC, Vargas-Madrazo E. Structural differences between the repertoires of mouse and human germline genes and their evolutionary implications. Immunogenetics 1998; 47:355-63.

62. Pink JRL, Milstein C. Inter Heavy-Light Chain Disulphide Bridge in Immune Globulins. Nature 1967; 214:92-4.

63. Frangione B, Milstein C, Franklin EC. Intrachain disulphide bridges in immunoglobulin G heavy chains. The Fc fragment. Biochem J 1968; 106:15-20.

64. Rother RP, Rollins SA, Mojcik CF, Brodsky RA, Bell L. Discovery and development of the complement inhibitor eculizumab for the treatment of paroxysmal nocturnal hemo-globinuria. Nat Biotechnol 2007; 25:1256-64.

65. Chiti F, Taddei N, Baroni F, Capanni C, Stefani M, Ramponi G, Dobson CM. Kinetic partitioning of protein folding and aggregation. Nat Struct Mol Biol 2002; 9:137-43.

66. Nelson R, Sawaya MR, Balbirnie M, Madsen AO, Riekel C, Grothe R, Eisenberg D. Structure of the cross-b spine of amyloid-like fibrils. Nature 2005; 435:773-8.

67. Zheng J, Ma B, Nussinov R. Consensus features in amyloid fibrils: sheet-sheet recogni-tion via a (polar or nonpolar) zipper structure. Phys Biol 2006; 3:1-4.

References 1. Kohler G, Milstein C. Continuous cultures of fused cells secreting antibody of pre-

defined specificity. Nature 1975; 256:495-7. 2. Cohen SN, Chang ACY, Boyer HW, Helling RB. Construction of Biologically

Functional Bacterial Plasmids in vitro. Proc Natl Acad Sci USA 1973; 70:3240-4. 3. McCafferty J, Griffiths AD, Winter G, Chiswell DJ. Phage antibodies: filamentous

phage displaying antibody variable domains. Nature 1990; 348:552-4. 4. Reichert JM, Rosensweig CJ, Faden LB, Dewitz MC. Monoclonal antibody successes in

the clinic. Nat Biotechnol 2005; 23:1073-8. 5. Branden C, Tooze J. Introduction to Protein Structure. Garland Publishing 1998. 6. Janeway CA, Travers P, Walport M, Sholmchik MJ. Immunobiology. Garland Science

2004. 7. Davies DR, Chacko S. Antibody structure. Acc Chem Res 1993; 26:421-7. 8. Harris LJ, Larson SB, Hasel KW, Day J, Greenwood A, McPherson A. The three-

dimensional structure of an intact monoclonal antibody for canine lymphoma. Nature 1992; 360:369-72.

9. Harris LJ, Skaletsky E, McPherson A. Crystallographic structure of an intact IgG1 monoclonal antibody. J Mol Biol 1998; 275:861-72.

10. Protein Formulation and Delivery. Informa Healthcare 1999. 11. Dobson CM. Protein folding and misfolding. Nature 2003; 426:884-90. 12. De Young LR, Fink AL, Dill KA. Aggregation of globular proteins. Acc Chem Res 1993;

26:614-20. 13. Manning MC, Patel K, Borchardt RT. Stability of Protein Pharmaceuticals. Pharm Res

1989; 6:903-18. 14. Braun A, Kwee L, Labow MA, Alsenz J. Protein Aggregates Seem to Play a Key Role

Among the Parameters Influencing the Antigenicity of Interferon Alpha (IFNα) in Normal and Transgenic Mice. Pharm Res 1997; 14:1472-8.

15. Wang W. Protein aggregation and its inhibition in biopharmaceutics. Int J Pharm 2005; 289:1-30.

16. Rosenberg AS. Effects of Protein Aggregates: An Immunologic Perspective. AAPS J 2006; 8:501-7.

17. Maas C, Hermeling S, Bouma B, Jiskoot W, Gebbink MFBG. A Role for Protein Misfolding in Immunogenicity of Biopharmaceuticals. J Biol Chem 2007; 282:2229-36.

18. Hermeling S, Crommelin D, Schellekens H, Jiskoot W. Structure-Immunogenicity Relationships of Therapeutic Proteins. Pharm Res 2004; 21:897-903.

19. Shire SJ, Shahrokh Z, Liu J. Challenges in the development of high protein concentra-tion formulations. J Pharm Sci 2004; 93:1390-402.

20. Sukumar M, Doyle B, Combs J, Pekar A. Opalescent Appearance of an IgG1 Antibody at High Concentrations and Its Relationship to Noncovalent Association. Pharm Res 2004; 21:1087-93.

21. Kelly JW. Attacking amyloid. N Engl J Med 2005; 352:722-3. 22. Sipe JD, Cohen AS. Review: history of the amyloid fibril. J Struct Biol 2000; 130:88-98. 23. Kelly JW. Amyloid fibril formation and protein misassembly: a structural quest for

insights into amyloid and prion diseases. Structure 1997; 5:595-600. 24. Fernandez-Escamilla A-M, Rousseau F, Schymkowitz J, Serrano L. Prediction of

sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nat Biotechnol 2004; 22:1302-6.

25. Tartaglia GG, Cavalli A, Pellarin R, Caflisch A. Prediction of aggregation rate and aggregation-prone segments in polypeptide sequences. Protein Sci 2005; 14:2723-34.

26. de Groot N, Pallares I, Aviles F, Vendrell J, Ventura S. Prediction of “hot spots” of aggre-gation in disease-linked polypeptides. BMC Struct Biol 2005; 5:18.

27. Tuite MF. Yeast Prions and Their Prion-Forming Domain. Cell 2000; 100:289-92. 28. Azriel R, Gazit E. Analysis of the Minimal Amyloid-forming Fragment of the Islet

Amyloid Polypeptide. An Experimental Support for the Key Role of the Phenylalanine Residue in Amyloid Formation. J Biol Chem 2001; 276:34156-61.

29. Wang W, Singh S, Zeng DL, King K, Nema S. Antibody structure, instability and for-mulation. J Pharm Sci 2007; 96:1-26.

30. Adams GP, Weiner LM. Monoclonal antibody therapy of cancer. Nat Biotechnol 2005; 23:1147-57.

31. Baker M. Upping the ante on antibodies. Nat Biotechnol 2005; 23:1065-72. 32. Kipriyanov S, Little M. Generation of recombinant antibodies. Mol Biotechnol 1999;

12:173-201. 33. Chadd HE, Chamow SM. Therapeutic antibody expression technology. Curr Opin

Biotechnol 2001; 12:188-94. 34. Carter PJ. Potent antibody therapeutics by design. Nat Rev Immunol 2006; 6:343-57. 35. Pendley C, Schantz A, Wagner C. Immunogenicity of therapeutic monoclonal antibod-

ies. Curr Opin Mol Ther 2003; 5:172-9. 36. Co MS, Avdalovic NM, Caron PC, Avdalovic MV, Scheinberg DA, Queen C. Chimeric

and humanized antibodies with specificity for the CD33 antigen. J Immunol 1992; 148:1149-54.

37. Werther WA, Gonzalez TN, O’Connor SJ, McCabe S, Chan B, Hotaling T, et al. Humanization of an anti-lymphocyte function-associated antigen (LFA)-1 monoclonal antibody and reengineering of the humanized antibody for binding to rhesus LFA-1. J Immunol 1996; 157:4986-95.

Page 14: Potential aggregation prone regions in biotherapeutics

www.landesbioscience.com mAbs 267

Aggregation prone regions in commercial mAbs

96. Muller YA, Chen Y, Christinger HW, Li B, Cunningham BC, Lowman HB, de Vos AM. VEGF and the Fab fragment of a humanized neutralizing antibody: crystal structure of the complex at 2.4 Å resolution and mutational analysis of the interface. Structure 1998; 6:1153-67.

97. James LC, Hale G, Waldman H, Bloomer AC. 1.9 Å structure of the therapeutic anti-body CAMPATH-1H fab in complex with a synthetic peptide antigen. J Mol Biol 1999; 289:293-301.

98. Li S, Schmitz KR, Jeffrey PD, Wiltzius JJW, Kussie P, Ferguson KM. Structural basis for inhibition of the epidermal growth factor receptor by cetuximab. Cancer cell 2005; 7:301-11.

99. Cho H-S, Mason K, Ramyar KX, Stanley AM, Gabelli SB, Denney DW, Leahy DJ. Structure of the extracellular region of HER2 alone and in complex with the Herceptin Fab. Nature 2003; 421:756-60.

100. Kjer-Nielsen L, Dunstone MA, Kostenko L, Ely LK, Beddoe T, Mifsud NA, et al. Crystal structure of the human T cell receptor CD3eg heterodimer complexed to the therapeutic mAb OKT3. Proc Natl Acad Sci USA 2004; 101:7675-80.

101. Mikol V. Structure of the Fab Fragment of SDZ CHI621: a Chimeric Antibody Against CD25. Acta Crystallogr D Biol Crystallogr 1996; 52:534-42.

102. Idusogie EE, Presta LG, Gazzano-Santoro H, Totpal K, Wong PY, Ultsch M, et al. Mapping of the C1q Binding Site on Rituxan, a Chimeric Antibody with a Human IgG1 Fc. J Immunol 2000; 164:4178-84.

103. Richardson JS, Richardson DC. Natural b-sheet proteins use negative design to avoid edge-to-edge aggregation. Proc Natl Acad Sci USA 2002; 99:2754-9.

68. Jones S, Manning J, Kad NM, Radford SE. Amyloid-forming Peptides from b2-Micro-globulin—Insights into the Mechanism of Fibril Formation in vitro. J Mol Biol 2003; 325:249-57.

69. Bourne PC, Ramsland PA, Shan L, Fan Z-C, DeWitt CR, Shultz BB, et al. Three-dimensional structure of an immunoglobulin light-chain dimer with amyloidogenic properties. Acta Crystallogr D Biol Crystallogr 2002; 58:815-23.

70. Pokkuluri RR, Solomon A, Weiss DT, Stevens FJ, Schiffer M. Tertiary structure of human 16 light chains. Amyloid 1999; 6:165-71.

71. Jaroniec CP, MacPhee CE, Astrof NS, Dobson CM, Griffin RG. Molecular conforma-tion of a peptide fragment of transthyretin in an amyloid fibril. Proc Natl Acad Sci USA 2002; 99:16748-53.

72. Westermark P. Aspects on human amyloid forms and their fibril polypeptides. FEBS J 2005; 272:5942-9.

73. Vetri V, Canale C, Relini A, Librizzi F, Militello V, Gliozzi A, Leone M. Amyloid fibrils formation and amorphous aggregation in concanavalin A. Biophys Chem 2007; 125:184-90.

74. Siepen JA, Radford SE, Westhead DR. b Edge strands in protein structure prediction and aggregation. Protein Sci 2003; 12:2348-59.

75. Pande A, Annunziata O, Asherie N, Ogun O, Benedek GB, Pande J. Decrease in Protein Solubility and Cataract Formation Caused by the Pro23 to Thr Mutation in Human gD-Crystallin. Biochemistry 2005; 44:2491-500.

76. Chiti F, Calamai M, Taddei N, Stefani M, Ramponi G, Dobson CM. Studies of the aggregation of mutant proteins in vitro provide insights into the genetics of amyloid diseases. Proc Natl Acad Sci USA 2002; 99:16419-26.

77. Tartaglia GG, Pawar AP, Campioni S, Dobson CM, Chiti F, Vendruscolo M. Prediction of Aggregation-Prone Regions in Structured Proteins. J Mol Biol 2008; 380:425-36.

78. Thompson MJ, Sievers SA, Karanicolas J, Ivanova MI, Baker D, Eisenberg D. The 3D profile method for identifying fibril-forming segments of proteins. Proc Natl Acad Sci USA 2006; 103:4074-8.

79. Trevino SR, Scholtz JM, Pace CN. Amino Acid Contribution to Protein Solubility: Asp, Glu and Ser Contribute more Favorably than the other Hydrophilic Amino Acids in RNase Sa. J Mol Biol 2007; 366:449-60.

80. Chiti F, Dobson CM. Protein Misfolding, Functional Amyloid and Human Disease. Annu Rev Biochem 2006; 75:333-66.

81. Creighton TE. Proteins: Structures and Molecular Properties W.H. Freeman and Company 1993.

82. Pieri L, Bucciantini M, Nosi D, Formigli L, Savistchenko J, Melki R, Stefani M. The Yeast Prion Ure2p Native-like Assemblies Are Toxic to Mammalian Cells Regardless of Their Aggregation State. J Biol Chem 2006; 281:15337-44.

83. Michelitsch MD, Weissman JS. A census of glutamine/asparagine-rich regions: Implications for their conserved function and the prediction of novel prions. Proc Natl Acad Sci USA 2000; 97:11910-5.

84. Chen SM, Berthelier VB, Yang W, Wetzel R. Polyglutamine Aggregation Behavior in vitro Supports a Recruitment Mechanism of Cytotoxicity. J Mol Biol 2001; 311:173-82.

85. Yang W, Dunlap JR, Andrews RB, Wetzel R. Aggregated polyglutamine peptides deliv-ered to nuclei are toxic to mammalian cells. Hum Mol Genet 2002; 11:2905-17.

86. Zoghbi HY, Orr HT. Glutamine repeats and neurodegeneration. Annu Rev Neurosci 2000; 23:217-47.

87. Perutz MK, Johnson T, Suzuki M, Finch JT. Glutamine repeats as polar zippers—Their possible role in inherited neurodegenerative diseases. Proc Natl Acad Sci USA 1994; 91:5355-8.

88. Tsai H-H, Gunasekaran K, Nussinov R. Sequence and Structure Analysis of Parallel b Helices: Implication for Constructing Amyloid Structural Models. Structure 2006; 14:1059-72.

89. Vitalis A, Wang X, Pappu RV. Atomistic Simulations of the Effects of Polyglutamine Chain Length and Solvent Quality on Conformational Equilibria and Spontaneous Homodimerization. J Mol Biol 2008; 384:279-97.

90. Armen RS, Bernard BM, Day R, Alonso DOV, Daggett V. Characterization of a possible amyloidogenic precursor in glutamine-repeat neurodegenerative diseases. Proc Natl Acad Sci USA 2005; 102:13433-8.

91. Bemporad F, Taddei N, Stefani M, Chiti F. Assessing the role of aromatic residues in the amyloid aggregation of human muscle acylphosphatase. Protein Sci 2006; 15:862-70.

92. Kim W, Hecht MH. Generic hydrophobic residues are sufficient to promote aggregation of the Alzheimer’s Ab42 peptide. Proc Natl Acad Sci USA 2006; 103:15824-9.

93. Monsellier E, Chiti F. Prevention of amyloid-like aggregation as a driving force of protein evolution. EMBO Rep 2007; 8:737-42.

94. Uversky VN, Fink AL. Conformational constraints for amyloid fibrillation: the importance of being unfolded. Biochim Biophys Acta—Proteins and Proteomics 2004; 1698:131-53.

95. Saphire EO, Stanfield RL, Max Crispin MD, Parren PWHI, Rudd PM, Dwek RA, et al. Contrasting IgG Structures Reveal Extreme Asymmetry and Flexibility. J Mol Biol 2002; 319:9-18.