

Host Cell Proteins (HCPs) in Plasma-Derived Biotherapeutics

Ilker Sen¹, Laura Smoyer², St John Skilton¹, Marshall Bern¹, Eric Carlson¹, Kevin Van Cott²

¹ Protein Metrics, San Carlos, CA
² University of Nebraska-Lincoln
Contact: [email protected]

Abstract

Residual host cell proteins (HCPs) that contaminate biotherapeutics may pose safety or stability risks. HCPs are typically found at low levels in highly purified proteins and need to be monitored per regulatory guidelines.

Most biotherapeutics are recombinantly expressed in Chinese Hamster Ovary (CHO) cells, and thus the HCPs monitored in these samples originate from the CHO host cells or from the media source used in cell culture (mostly bovine). Another category of biotherapeutics is derived from human plasma. For human plasma-derived products, immunogenicity is not usually a concern unless the HCPs are in a modified form, for example aggregated or oxidized. Instead, the primary concern is the biological function of the HCPs.

Detecting and measuring HCPs is particularly challenging for plasma-derived products, mainly because human plasma proteins are highly glycosylated. Here, we present a mass spectrometry and analysis workflow to identify and quantify host cell proteins in plasma-derived products.

Methods

Protein Metrics software can automatically process data from Agilent, Bruker, Sciex, Shimadzu, Thermo, and Waters instruments.

Beta-2-glycoprotein 1 was purified from donor human plasma and used as a model system. We spiked known protein digest standards into the trypsinized protein at 1:50 and 1:1000 ratios and injected the mixture into a Waters Synapt G2S mass spectrometer operated in MSE mode.

Data analysis was performed using Protein Metrics software. Briefly, peptides were comprehensively identified by searching a UniProt human protein database. All identified peptides and proteins were quantified by extracted ion chromatogram (XIC), automatically derived from the identified peptides using the mass of the precursor ion. Fragment errors and spectra are displayed for the user to review, and a mechanism to differentiate between true and false positives is provided.
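To illustrate how an XIC can be derived from a precursor mass, here is a minimal Python sketch. It is not the Protein Metrics implementation: the function names, the 10 ppm tolerance, the scan layout, and the toy data are all assumptions made for illustration.

```python
PROTON_MASS = 1.007276  # Da

def precursor_mz(neutral_mass, charge):
    """m/z of a peptide with the given monoisotopic neutral mass and charge."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def xic_area(scans, target_mz, ppm_tol=10.0):
    """Integrate the intensity found near target_mz over retention time.

    scans: list of (retention_time, [(mz, intensity), ...]) sorted by time.
    ppm_tol: hypothetical extraction tolerance; real values depend on the instrument.
    """
    half_width = target_mz * ppm_tol / 1e6
    # Build the chromatogram trace: one summed intensity per scan.
    trace = []
    for rt, peaks in scans:
        signal = sum(i for mz, i in peaks if abs(mz - target_mz) <= half_width)
        trace.append((rt, signal))
    # Trapezoid rule over consecutive scan pairs.
    area = 0.0
    for (t0, y0), (t1, y1) in zip(trace, trace[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area

# Toy scans (invented values): one peptide eluting over about 0.4 minutes.
scans = [
    (10.0, [(500.7573, 0.0), (600.0, 50.0)]),
    (10.1, [(500.7575, 100.0)]),
    (10.2, [(500.7571, 200.0), (500.9, 999.0)]),  # 500.9 lies outside the window
    (10.3, [(500.7574, 100.0)]),
    (10.4, [(500.7573, 0.0)]),
]
mz = precursor_mz(999.5, 2)
area = xic_area(scans, mz)
```

The peak at m/z 500.9 is ignored because it falls outside the ppm window, so only signal consistent with the identified precursor contributes to the area.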

A pre-prepared report displays the data in a variety of tables and graphs according to user-defined settings.

Host “Cell” = Human Plasma

Byos automatically produces a report from the tabulated data via a ‘pivot table’ summary. The pivot table summary is the key reporting mechanism, presenting the data in a form suitable for specialists and non-specialists alike. Graphs, tables, heat maps, bar charts, and other representations are available from a simple drop-down menu of visualization types.

Challenge: Plasma proteins are glycosylated!

Solution: Byonic - site-specific glycosylation analysis

The underlying identification in Byos is provided by the Byonic™ search engine. In these data we identified glycopeptides to the level of peptide sequence and glycan composition. Several pre-prepared glycan databases are provided; users can modify these or create their own.
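As a rough illustration of what identification to the level of glycan composition involves, the Python sketch below computes the mass of a composition from monosaccharide residue masses and matches a mass difference against a small composition list. The three-entry database, the function names, and the 0.02 Da tolerance are hypothetical; they do not reproduce Byonic's databases or scoring.

```python
# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}

# Hypothetical miniature glycan database: name -> composition.
GLYCAN_DB = {
    "HexNAc(2)Hex(5)": {"HexNAc": 2, "Hex": 5},
    "HexNAc(4)Hex(5)Fuc(1)": {"HexNAc": 4, "Hex": 5, "Fuc": 1},
    "HexNAc(4)Hex(5)NeuAc(2)": {"HexNAc": 4, "Hex": 5, "NeuAc": 2},
}

def glycan_mass(composition):
    """Mass added to a peptide by a glycan with this composition."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())

def glycopeptide_mass(peptide_mass, composition):
    """Neutral mass of the peptide carrying the glycan."""
    return peptide_mass + glycan_mass(composition)

def match_glycans(mass_delta, database, tol_da=0.02):
    """Names of compositions whose mass matches the observed mass difference."""
    return [
        name
        for name, comp in database.items()
        if abs(glycan_mass(comp) - mass_delta) <= tol_da
    ]

# Example: a precursor about 1216.42 Da heavier than the bare peptide.
hits = match_glycans(1216.42, GLYCAN_DB)
```

A search engine without such a composition list has no candidate that explains the extra mass, which is why glycosylated peptides go unidentified in a conventional search.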

Protein Metrics has developed a customizable Pivot Table format for easy reporting. For Host Cell Proteins, a pre-set format is available. The data are displayed in a variety of visualizations, and user and audit data are automatically listed in a summary tab. Here, Bruker data are shown in a heatmap with percentage values in the example at the left.

Table: Identified host cell proteins in the B2G1 sample. Quantification was performed by summing the XICs of the three most intense peptides of each protein and then normalizing the values to the product, so the HCPs are reported as % of product. The + Glyc column shows the % abundance of each protein when the data are searched with glycopeptides; the – Glyc column shows the results when glycopeptides are excluded from the search. ∆ is the difference between the + Glyc and – Glyc values, expressed as a percentage of the + Glyc value. Proteins with ∆ = 100% can only be identified with a search engine capable of searching glycopeptides, such as Byonic. Proteins with ∆ = 0% have the same identification and quantification with and without the glycan search. Proteins with ∆ between 0% and 100% could be identified without a glycopeptide search, but their quantification would be skewed below their actual abundance.

Note that the user can choose whether to normalize against the ‘Sum’ of all identified proteins, the ‘Maximum’ identified protein, or a ‘Custom’ spiked-in protein. This allows the analyst to cope with a variety of scenarios, such as a saturated detector or relative protein amounts that are biased in other ways. An interesting case is Carboxypeptidase D: this protease, which would be an undesired contaminant in a biopharmaceutical product, can only be identified when the data are searched with glycopeptides.
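The quantification described in the caption can be sketched in a few lines of Python. This is an illustrative reading of the text, not the Byos code: the function names, the mode strings, and the toy XIC values are assumptions, and the ‘custom’ mode here simply takes the name of whichever protein the analyst chooses as the reference.

```python
def top3_intensity(peptide_xics):
    """Sum of the three largest peptide XIC areas (fewer if under three exist)."""
    return sum(sorted(peptide_xics, reverse=True)[:3])

def percent_abundance(protein_peptides, mode="custom", reference=None):
    """Normalize top-3 intensities to a percentage.

    protein_peptides: dict mapping protein name -> list of peptide XIC areas.
    mode: "sum" (all identified proteins), "maximum" (most abundant protein),
          or "custom" (the protein named by `reference`).
    """
    raw = {name: top3_intensity(xics) for name, xics in protein_peptides.items()}
    if mode == "sum":
        denominator = sum(raw.values())
    elif mode == "maximum":
        denominator = max(raw.values())
    elif mode == "custom":
        denominator = raw[reference]
    else:
        raise ValueError(f"unknown normalization mode: {mode}")
    return {name: 100.0 * value / denominator for name, value in raw.items()}

# Toy XIC areas (invented): a product and two HCPs.
sample = {
    "product": [500.0, 300.0, 200.0, 50.0],  # fourth peptide is not counted
    "hcp_a": [100.0, 80.0, 20.0, 5.0],
    "hcp_b": [30.0, 20.0],                   # only two peptides identified
}
vs_product = percent_abundance(sample, mode="custom", reference="product")
vs_sum = percent_abundance(sample, mode="sum")
```

With these toy values, `hcp_a` is 20% of the product but 16% of the summed signal, which shows how the choice of reference changes the reported number without changing the underlying data.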

Discussion and Conclusions

Glycosylated proteins are numerous in mammalian proteomes and are challenging to identify as host cell proteins. Glycopeptide search capabilities are essential for identifying and quantifying these residual HCPs. Because HCP studies draw on a wide variety of data sources, a coherent mechanism to analyze, quantify, and present the data benefits any laboratory aiming to achieve standardization.

Being able to accommodate various strategies for identification and quantitation of HCPs is also beneficial. In some instances a simple spiked-in standard may suffice; in others the quantification may need to be done against a specific (biotherapeutic) protein, or the sum total of all HCPs may be needed. A simple tool that produces this variety of data instantly can therefore benefit a laboratory’s view of the HCP profile. The mechanisms shown here provide a number of advantages:

• Ability to present data to non-experts while avoiding mass spectrometry jargon
• Lower barrier to staff training through the use of pre-set templates
• Consistent analysis irrespective of user or laboratory, reducing the risk of human bias
• Reduction or elimination of the need to cut and paste data from spreadsheets
• Reduced reliance on vendor software, especially where it is designed for other purposes
• A choice of quantitation mechanisms that can be adapted to the philosophy of the organization

www.proteinmetrics.com

Beta-2-glycoprotein 1 – HCPs

Legend: Product, Spike 1, Spike 2

% Abundance
+ Glyc   – Glyc   ∆      Protein
100      100      0%     sp|P02749|APOH_HUMAN Beta-2-glycoprotein 1
77.7     38.7     50%    sp|Q3B7T1|EDRF1_HUMAN Erythroid differentiation-related factor 1
66       53.7     19%    sp|P35916|VGFR3_HUMAN Vascular endothelial growth factor receptor 3
60.3     55.7     8%     sp|Q92673|SORL_HUMAN Sortilin-related receptor
40.8     20.4     50%    sp|Q7Z408|CSMD2_HUMAN CUB and sushi domain-containing protein 2
32.4     32.4     0%     sp|P24347|MMP11_HUMAN Stromelysin-3
32.2              100%   sp|Q496J9|SV2C_HUMAN Synaptic vesicle glycoprotein 2C
30.6     19.4     37%    sp|Q8TD84|DSCL1_HUMAN Down syndrome cell adhesion molecule-like protein 1
28.7              100%   sp|O75976|CBPD_HUMAN Carboxypeptidase D
26.1     21.6     17%    sp|Q9NR61|DLL4_HUMAN Delta-like protein 4
24.6     22.7     8%     sp|O75376|NCOR1_HUMAN Nuclear receptor corepressor 1
19.5     2.24     89%    sp|Q2M3G0|ABCB5_HUMAN ATP-binding cassette sub-family B member 5
13.8     13.8     0%     sp|P00330|ADH1_YEAST Alcohol dehydrogenase 1
12.9     7.29     43%    sp|A8MUP6|GS1L2_HUMAN Germ cell-specific gene 1-like protein 2
12.7     8.24     35%    sp|Q3ZCX4|ZN568_HUMAN Zinc finger protein 568
12.3              100%   sp|Q8N3J3|CQ053_HUMAN Uncharacterized protein C17orf53
10.8     10.8     0%     sp|P27918|PROP_HUMAN Properdin
9.11              100%   sp|Q9HC62|SENP2_HUMAN Sentrin-specific protease 2
8.44     3.67     57%    sp|Q9Y4C5|CHST2_HUMAN Carbohydrate sulfotransferase 2
8.09              100%   sp|Q8N5I4|DHRSX_HUMAN Dehydrogenase/reductase SDR family member on chromosome X
6.84              100%   sp|P49721|PSB2_HUMAN Proteasome subunit beta type-2
6.77     3.42     49%    sp|P02790|HEMO_HUMAN Hemopexin
4.1      4.1      0%     sp|Q6NUP7|PP4R4_HUMAN Serine/threonine-protein phosphatase 4 regulatory subunit 4
4.01              100%   sp|Q9H3Q3|G3ST2_HUMAN Galactose-3-O-sulfotransferase 2
3.69     0.402    89%    sp|P59047|NALP5_HUMAN NACHT, LRR and PYD domains-containing protein 5
2.71     2.38     12%    sp|Q3MIR4|CC50B_HUMAN Cell cycle control protein 50B
2.64     2.64     0%     sp|Q96P66|GP101_HUMAN Probable G-protein coupled receptor 101
2.61     0.787    70%    sp|Q92954|PRG4_HUMAN Proteoglycan 4
2.51     2.51     0%     sp|TRYP_PIG|(Common contaminant protein)
1.91              100%   sp|Q9NTJ4|MA2C1_HUMAN Alpha-mannosidase 2C1
1.86     0.324    83%    sp|P02763|A1AG1_HUMAN Alpha-1-acid glycoprotein 1
0.437    0.437    0%     sp|P00924|ENO1_YEAST Enolase 1
0.869    0.544    37%    sp|P02766|TTHY_HUMAN Transthyretin
0.709    0.709    0%     sp|P50336|PPOX_HUMAN Protoporphyrinogen oxidase
0.623    0.623    0%     sp|P00738|HPT_HUMAN Haptoglobin
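The ∆ values in the table are consistent with a relative difference, (+ Glyc minus – Glyc) divided by + Glyc. That formula is inferred from the tabulated numbers and is not stated explicitly in the poster, so the Python sketch below should be read as an assumption; the function name and the use of `None` for a protein with no – Glyc identification are also illustrative choices.

```python
def delta_percent(plus_glyc, minus_glyc):
    """Percentage of the + Glyc abundance that is lost without glycopeptide search.

    minus_glyc is None when the protein is not identified without glycopeptides.
    """
    if minus_glyc is None:
        return 100.0
    return 100.0 * (plus_glyc - minus_glyc) / plus_glyc

# A few rows taken from the table: (entry, + Glyc, - Glyc).
rows = [
    ("EDRF1_HUMAN", 77.7, 38.7),
    ("VGFR3_HUMAN", 66.0, 53.7),
    ("CBPD_HUMAN", 28.7, None),
    ("MMP11_HUMAN", 32.4, 32.4),
    ("ABCB5_HUMAN", 19.5, 2.24),
]
deltas = {name: round(delta_percent(plus, minus)) for name, plus, minus in rows}
```

Rounded to whole percentages, these rows reproduce the tabulated ∆ values of 50, 19, 100, 0, and 89.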

Identify, Quantify…. …Report

Glycopeptide spectrum from