Development of in silico models to predict viscosity and mouse clearance using a comprehensive analytical data set collected on 83 scaffold-consistent monoclonal antibodies

Article: 2256745 | Received 29 May 2023, Accepted 05 Sep 2023, Published online: 12 Sep 2023

ABSTRACT

Biologic drug discovery pipelines are designed to deliver protein therapeutics that have exquisite functional potency and selectivity while also manifesting biophysical characteristics suitable for manufacturing, storage, and convenient administration to patients. The ability to use computational methods to predict biophysical properties from protein sequence, potentially in combination with high throughput assays, could decrease timelines and increase the success rates for therapeutic developability engineering by eliminating lengthy and expensive cycles of recombinant protein production and testing. To support development of high-quality predictive models for antibody developability, we designed a sequence-diverse panel of 83 effector functionless IgG1 antibodies displaying a range of biophysical properties, produced and formulated each protein under standard platform conditions, and collected a comprehensive package of analytical data, including in vitro assays and in vivo mouse pharmacokinetics. We used this robust training data set to build machine learning classifier models that can predict complex protein behavior from these data and features derived from predicted and/or experimental structures. Our models predict with 87% accuracy whether viscosity at 150 mg/mL is above or below a threshold of 15 centipoise (cP) and with 75% accuracy whether the area under the plasma drug concentration–time curve (AUC0–672 h) in normal mouse is above or below a threshold of 3.9 × 10⁶ h × ng/mL.

Introduction

Biologic therapeutics have established themselves as powerful and essential medicines for patients; in 2021, biologic therapeutics accounted for 29 of the top 50 best-selling drugs and generated >$231 billion in sales.Citation1 Post-pandemic, monoclonal antibodies (mAbs) remain the single biologic modality that generates the most revenue ($110 billion), topping mRNA vaccines ($79 billion). Indeed, 23 of the 25 best-selling protein drugs are antibodies or contain antibody-derived domains, in large part because this class of protein has been designed by nature to be capable of achieving many functional properties that are desirable in a therapeutic, such as high-affinity binding to a diverse array of epitopes, high selectivity against off-target binding, long circulation half-life, and targeted interactions with the human immune system. However, developing antibody-derived therapeutics is challenging because nature did not design these proteins to be drugs, i.e., to meet requirements of manufacturing, including storage and distribution, and to be delivered exogenously to patients through safe and convenient devices. The biopharmaceutical industry invests substantial time and effort engineering natural antibodies to improve their developability, increasing time to the clinic.

During developability engineering, as a therapeutic progresses through research to development and into the clinic, multiple analytical techniques are required for full characterization of the therapeutic to determine higher-order structure, performance in manufacturing processes, and the presence of any molecule-related impurities. Analytical techniques used for such purposes include, but are not limited to, size-exclusion chromatographyCitation2 (SEC), hydrophobic interaction chromatographyCitation3 (HIC), ion-exchange chromatographyCitation4 (IEX), gel- and capillary-electrophoresisCitation5 (CE), differential scanning calorimetryCitation6 (DSC), differential scanning fluorimetryCitation7 (DSF), Fourier transform infrared spectroscopy (FT-IR), Fourier transform Raman spectroscopyCitation8 (FT-Raman), protein nuclear magnetic resonance (NMR) spectroscopy,Citation9 circular dichroismCitation10 (CD), dynamic light scatteringCitation11 (DLS), rheology,Citation12 mass spectrometryCitation13 (MS), and tandem mass spectrometry (MS/MS).Citation14 These methods are time-consuming since they are often manual and low throughput, and rely on the production and purification of recombinant protein samples. Furthermore, the total mass of protein required to build a full assay package can be substantial.

The biopharmaceutical industry is therefore strongly motivated to develop technologies that enable faster and more efficient therapeutic protein engineering. One approach is to build miniaturized and high(er) throughput versions of analytical techniques, such as those described above, and use the measurements of molecule attributes such as colloidal, physical, chemical, and thermal stability to predict the developability of the therapeutic. For example, in 2017 Jain et al.Citation15 used variants of 137 antibodies that are currently in the advanced stages of clinical trials, including 48 clinically approved as therapeutics, to develop metrics for antibody developability and to define boundaries for drug-like behaviors, with the ultimate goal of creating practical guidelines for future mAb-based therapeutic candidates. In 2020, Bailly et al.Citation16 described a series of biophysical property assays for 152 humanized mAbs, over two case studies, concluding that physicochemical properties and key assay endpoints correlated with key downstream process parameters. They report workflows that demonstrate effective rank ordering and elimination of sub-optimal mAbs and that enable further engineering of problematic sequence attributes without affecting program timelines. Also in 2020, Kingsbury et al.Citation17 presented a set of 59 mAbs (including 43 approved mAbs) where poor solution behavior is reliably predicted (>90%) by the measured diffusion interaction parameter (kD). Additionally, it was demonstrated that mAbs in this panel with high positive charges and isoelectric point (pI) values presented no pharmacokinetic (PK) disadvantages in humans.

Such higher throughput analytical approaches to therapeutic developability characterization are advantageous because they consume less time and resource per candidate; however, these approaches do still rely on time-consuming recombinant protein production and analytical measurements. An alternative approach is to computationally predict “expensive” attribute properties (i.e., those that are time- and resource-intensive to measure) using inputs that are easier to obtain, such as protein sequence and/or predicted protein structure. Computational surrogates have been developed for expensive developability attributes such as chemical liabilities (reviewed in ref),Citation18 colloidal stability, and fast human clearance, and have included: 1) physics-based prediction,Citation19–23 2) statistical models,Citation24,Citation25 and 3) hybrid approaches.Citation26 More recently, there has been intense interest in applying modern machine learning methods to the prediction of protein function from sequence (for review, see ref).Citation27 In the developability space, a priority of several teamsCitation28–30 has been prediction of high-concentration viscosity, which is tractable (albeit expensive) to measure in vitro, and therefore well suited to an approach in which high-quality, fit-for-purpose data sets are built for model training.

Herein, we describe a thorough analytical assessment of 83 mAbs in a uniform effector functionless IgG1 isotype.Citation31 All molecules were produced using standard antibody expression and purification platforms and were assessed with a diverse range of both high- and low-throughput analytical methods for colloidal, physical, chemical, and thermal stability. Furthermore, to complement our use of computational methods for structure prediction, we separately produced the fragment antigen binding (Fab) domains of a subset of 23 of these mAbs and solved the Fab X-ray crystal structures. We describe our use of this comprehensive analytical data package to train models that can predict viscosity and mouse in vivo clearance, two important but expensive properties relevant to the development of biologics. Our results demonstrate that effective models for both properties can be trained using high-throughput assay data alone, or a combination of assay data and predicted structure, an approach that is aligned with industry-wide goals to replace, reduce, and refine the use of animals in research according to the 3Rs AlternativesCitation32 and the recent Food and Drug Administration Modernization Act 2.0. The proof-of-concept models reported here are representative of the kinds of tools that can be created from carefully designed and curated data sets such as the one that is the focus of the work presented here. Other analyses using these data are reported in companion papers (see Discussion section for additional information).

Results

Molecule and assay selection

Machine learning approaches rely on data sets of sufficient size and quality to build accurate and robust predictive models. While we had access to proprietary internal data from many years of biologic therapeutic campaigns, we noticed that many of the most resource-intensive properties to obtain, i.e., those that are measured in later stages of the therapeutic pipeline, were biased in our existing data sets toward “good” behavior. This is not surprising; by design, proteins with even hints of biophysical liabilities are aggressively filtered out of the pipeline in the early stages of discovery. To remedy this scarcity of negative data from poorly behaved molecules in our internal databases, we decided to collect a comprehensive data package on a new panel of antibodies, and we designed that panel to have a balanced representation of biophysical properties of interest, i.e., to include many molecules with “poor” biophysical behavior. For example, because a major focus of the panel design was viscosity at high concentration (≥150 mg/mL), we searched our databases and the literature to ensure we included at least 20% of molecules with known viscosity >15 cP at 150 mg/mL.

To ensure robustness of our data package linking antibody sequence and physicochemical properties to developability, we produced our molecules in parallel, in the same Fc isotype, and in the same expression and purification platform. We selected 83 antibodies targeting 33 antigens, all in the IgG1z stable effector functionless (SEFL)2.2 isotype, an aglycosylated IgG1 of the G1m17 allotype that has been engineered with an N297G mutation to reduce binding to Fcγ receptors and with R292C and V302C mutations to improve thermostability through an engineered disulfide bond. Our panel was sequence-diverse, representing 66 unique VH + VL clades when clustered by the unweighted pair group method with arithmetic mean (UPGMA) using a transversal branch length of 0.1. The antibodies were produced in Chinese hamster ovary (CHO) cells and purified using highly consistent methodology to generate at least 200 mg to support a comprehensive evaluation of each protein lot. Further, for a subset of 23 of these antibodies we also expressed and purified the Fab domain alone and used these Fabs to solve X-ray crystal structures as a complement to in silico structure prediction methods.

We measured critical developability attributes of each antibody using a battery of analytical and biophysical assays (Table 1), including several that required concentration to 70 and 150 mg/mL, as well as mouse in vivo PK studies at 2 mg/kg. We report the results of all these assays in one comprehensive table (Table S1); each column is a value measured or calculated from a physical experiment on the same purified protein lot of each antibody. Table S2 reports the metadata describing how the value in each column of Table S1 was obtained (i.e., experimental conditions and data analysis calculations).

Table 1. Data collected on the antibody panel.

Correlations between recorded attributes

A total of 106 developability attributes were recorded. Table S3 contains all pairwise Spearman correlations among the attributes. The inter-assay correlations for a representative subset of 20 assays are presented in Figure 1. As expected, the strongest correlations are between assays that measure similar properties. For example, across different temperatures, concentrations, and durations, the correlations in aggregation, as measured by size exclusion ultra-high performance liquid chromatography (SE-UHPLC), range from ρ=0.74 to 1.0. Additional strong correlations include those between the temperature at the onset of aggregation (Tagg) and the melting temperature (Tm1) at 70 mg/mL (ρ=0.76;p<0.001). Finally, we observe a correlation between DLS diffusion interaction parameter (kD) and polyethylene glycol (PEG) precipitation (ρ=0.62;p<0.001). Such correlations may present opportunities to reduce resourcing demands by eliminating redundant assays or employing targeted high throughput (HT) assays. Interestingly, correlations are also observed across assays that are putatively describing nonspecific interactions of different biophysical character. For example, correlations are seen between the poly-D-Lysine assay score and the membrane prep assay score (ρ=0.81;p<0.001) and between the poly-D-Lysine assay score and the polyethylene imine (PEI) score (ρ=0.76;p<0.001).

Figure 1. Pairwise Spearman correlations for a selected set of in vitro assays. See Table S3 for the complete set of pairwise correlations.

A visual cross-correlation matrix for 20 different assays in which the strength of the correlation is proportional to the size of a circle in the matrix and the direction of the correlation is represented by color on a blue-white-red scale with dark blue = perfect positive correlation, dark red = perfect negative correlation, and white = no correlation. The matrix diagonal (self-correlation) is shown as large blue circles; most other squares show small circles, with a pocket of larger blue circles in assays related to charge (heparin chromatography, zeta potential, poly-D-Lys score, PEI score).

A significant anti-correlation is seen between viscosity measured near 150 mg/mL and DLS kD (ρ=−0.61;p<0.001). Weaker, but still significant, correlations are seen between viscosity measured near 150 mg/mL and heparin chromatography parameters (ρ0.43;p<0.001). The magnitude of such correlations may not be sufficient to eliminate the need for resource-intensive assays altogether, but may provide some predictive value when combined with other parameters in the context of a machine learning model.
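To make the correlation analysis concrete, here is a minimal sketch of how a pairwise Spearman table could be computed from an assay table. It is not the authors' pipeline: the assay names and all numeric values below are made up for illustration, missing measurements are assumed to be recorded as None, and p-values are not computed.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks, skipping missing pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))


def pairwise_spearman(table):
    """Return {(assay_a, assay_b): rho} for every unordered pair of columns."""
    return {
        (a, b): spearman(table[a], table[b])
        for a, b in combinations(sorted(table), 2)
    }


# Hypothetical values for five molecules (not data from the study).
assays = {
    "viscosity_cP": [5.0, 8.0, 12.0, 20.0, 35.0],
    "dls_kd": [30.0, 22.0, 10.0, -5.0, -12.0],
    "peg_midpoint": [14.0, 15.0, 11.0, 9.0, None],
}
rho = pairwise_spearman(assays)
```

Because Spearman correlation works on ranks, it captures monotonic but non-linear relationships, which suits a property like viscosity that does not scale linearly with the underlying assay readouts.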

Protein production and quality metrics

The 83 antibodies were expressed in either CHO-K1 or CHO Z2A4–18 cells with titers ranging from 60 mg/L to 2.5 g/L. All antibodies were purified using identical processes that included Fc affinity capture by protein A (ProA) followed by cation exchange chromatography (CEX). Most exhibited CEX elution profiles consistent with mAbs (Figure S1A), but six molecules showed an increase in high molecular weight (HMW) species (Ab54, Ab2, Ab42, Ab51, Ab55, Ab62) that could be separated away from the monomeric target peak (Figure S1B). One molecule (Ab82) separated into two, poorly resolved species (Figure S1C). Each peak was pooled separately, but both were indistinguishable by microcapillary analysis, analytical size exclusion-chromatography (aSEC) and mass spectrometry. Production of this molecule was repeated and demonstrated similar behavior, suggesting this behavior was molecule-specific. The earlier eluting peak produced more material and was selected for downstream analytics. All antibodies were formulated into a consistent buffer (10 mM sodium acetate, 9% w/v sucrose, pH 5.2). Overall purification productivity varied significantly and ranged from 0.015 to 1.6 g/L after two-column purification, in line with our expectations for a panel selected to have a wide range of developability profiles.

Antibodies were purified to high starting aSEC purity; representative aSEC profiles are shown in Figure S2A and representative profiles for non-reduced and reduced MCE are shown in Figure S2B and S2C, respectively. At t = 0, all purified antibodies had >97% main peak, good peak asymmetry, and all but two had high molecular weight (HMW) <2%. One of these two antibodies, Ab60, initially passed our quality control with >99% aSEC main peak. We later discovered that this main peak did not represent soluble antibody monomer, but rather an oligomeric species that was consistent with a trimeric species (481 kDa) determined by size exclusion chromatography-multi-angle light scattering (SEC-MALS) (Figure S3). The assay data collected on this soluble oligomer are included in our data tables, but were not used for model building. By non-reduced MCE (nrMCE), 80 antibodies had main peak purity >95% while three molecules (Ab54, Ab51, Ab81) had nrMCE main peak purity between 87% and 92%, primarily due to lower molecular weight contaminants. Further analysis showed that two molecules (Ab54, Ab51) had clips in the heavy chain at G104/T105 and R56/S57, respectively. These clips were confirmed by comparison of non-reduced and reduced MCE results (data not shown) with intact mass spectrometry in Figure S4 for Ab54 and Figure S5 for Ab51. We also produced an 84th antibody (Ab84) that contained an unpaired cysteine conjugation handle in the constant heavy 2 (CH2) domain, but eliminated this molecule from our developability analyses to focus on molecules of a uniform SEFL2.2 Fc scaffold.

For 23 of these antibodies, we also separately expressed the Fab domains for crystallography in CHO-K1, with titers ranging from 33 mg/L to 1.26 g/L. All Fabs were captured with anti-constant heavy 1 (CH1) affinity resins (CaptureSelect CH1-XL, KanCapG) followed by cation exchange chromatography (CEX). All Fabs were formulated into a consistent buffer (10 mM sodium acetate, 9% w/v sucrose, pH 5.2). Overall purification productivity varied significantly and ranged from 0.013 to 0.270 g/L after two-column purification. Fabs were purified to high aSEC purity; final lots were all >97% main peak with good peak asymmetry. By non-reduced MCE (nrMCE), all Fabs had main peak purity >95% (data not shown).

To confirm primary sequence (amino acid composition), reversed phase liquid chromatographic mass spectrometry (rpLC-MS) was performed to obtain an accurate molecular weight (Mw) measurement of each mAb. The rpLC-MS method described herein is based on the method described by Dillon et al.Citation33,Citation34 A representative rpLC-MS total ion chromatogram (Figure S6A), unprocessed (Figure S6B), and deconvolved (Figure S6B) neutral Mw spectra are shown in the supporting information. The rpLC-MS analyses of our panel of aglycosylated proteins did not identify any significant differences in the levels of post-translational modifications (glycation, lysine hydroxylation, incomplete signal peptide processing, incomplete C-terminal lysine processing) or differences in product quality between proteins expressed in the two different CHO hosts.

Structure prediction and comparison with X-ray crystal structures

Obtaining experimental structures of thousands of antibodies in the early discovery pipeline would be resource intensive and, more importantly, add significant time to each drug discovery project. Our preference was to build a workflow that uses in silico-predicted structures to enable developability prediction without requiring experimental structures. We predicted Fab structures of all 83 antibodies using Amgen’s internal high-throughput homology modeling pipeline, which was built using the Molecular Operating Environment (MOE)Citation35 software package from Chemical Computing Group. We used these homology models to calculate, in MOE, hundreds of structure-based features to support model building.

It was important to us to verify the feasibility of relying on features calculated from predicted structures rather than solving experimental structures, and we initiated a large-scale crystallography effort to evaluate the accuracy of modeling solutions. To assess the worst-case scenario, we focused our structure work on antibodies for which we expected the least reliable structure prediction, either because we observed long CDR-H3 regions or high sequence divergence from available template structures used for homology modeling. We solved the crystal structures of 23 Fabs to at least 2.8 Å and compared relevant regions of each structure, including fragment variable (Fv) and each complementarity-determining region (CDR) of the heavy chain and the light chain (CDR-H1, CDR-H2, CDR-H3, CDR-L1, CDR-L2 and CDR-L3), to corresponding structural models that were generated by several structure prediction methods (without any reference to the solved structural coordinates). In addition to our internal homology modeling pipeline (MOE at pH 5), we also generated structures of these 23 Fabs using MOE at pH 6 and pH 7, Discovery Studio,Citation36 Maestro,Citation37 and DeepAb.Citation38

The performance of the modeling tools was assessed by measuring root-mean-square deviation (RMSD) between the predicted structures and experimental crystal structure data and then calculating the percentage of structure features with RMSD < 2.0 Å. The CDR-L2 for Ab73 was omitted from calculations because this region was unmodeled in the crystal structure due to lack of electron density. Figure 2 compares this performance for each homology modeling method across antibody regions including Fv (panel A), VH (panel B), VL (panel C), CDR-H1 (panel D), CDR-H2 (panel E), CDR-H3 (panel F), CDR-L1 (panel G), CDR-L2 (panel H), and CDR-L3 (panel I). All regions except CDR-H3 show close agreement between the experimental structures and the predicted models. The percent of structural features in each region with RMSD < 2.0 Å across all models is reported in Table S4. As expected, the largest RMSD was observed for CDR-H3, with the percentage of the CDR-H3 loops with RMSD < 2.0 Å ranging from 50% (MOE pH 6) to 71% (DeepAb), in line with our expectations for these antibodies, which have variable CDR-H3 loop lengths and diverse sequences. In general, however, the average RMSD of structural models for CDR-H3 for these challenging sequences was near 2 Å (DeepAb = 1.58 Å, Discovery Studio = 1.88 Å, Maestro = 2.05 Å, MOE pH 5 = 2.09 Å, MOE pH 6 = 2.10 Å, MOE pH 7 = 2.10 Å), and we concluded there were no major risks in using features calculated from predicted solution structures rather than from an experimental crystal structure. In evaluating whether structural features did indeed provide value during property prediction, e.g., for viscosity, we used predicted structures from our automated internal structure prediction pipeline.
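The accuracy metric described above can be sketched as follows. This is an illustrative simplification rather than the authors' procedure: it assumes the predicted and experimental coordinates have already been superposed and matched atom for atom (the text does not specify the superposition method or atom selection), and it represents an omitted region, such as the unmodeled CDR-L2 of Ab73, as None.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """RMSD in the input units between two equal-length lists of (x, y, z).

    Assumes the two coordinate sets are already superposed and that atom i
    in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


def percent_below(rmsd_values, cutoff=2.0):
    """Percentage of regions with RMSD below the cutoff; None entries are skipped."""
    values = [v for v in rmsd_values if v is not None]
    return 100.0 * sum(v < cutoff for v in values) / len(values)


# Hypothetical per-antibody CDR-H3 RMSDs (Å) for one modeling method.
cdr_h3_rmsd = [1.2, 0.9, 3.4, 1.8, None, 2.6]
fraction_accurate = percent_below(cdr_h3_rmsd)
```

Reporting the percentage below a cutoff alongside the mean RMSD is useful because a few badly modeled loops can inflate the mean while most models remain accurate.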

Figure 2. Homology modeling pipeline accuracy measured across all 23 experimental crystal structures.

Nine panels with six colorful curves each that rise from left to right to demonstrate the accuracy of structure prediction for different antibody regions (6 CDRs, the full Fv, VH, and VL). The brown DeepAb curve is generally furthest left in each panel, indicating higher accuracy across all antibody regions.

Thermal stability

Thermal stability is commonly used to assess the stability of protein therapeutics.Citation39 There are multiple techniques available, varying in mechanism of detection and assay throughput. DSC has been considered the gold standard to determine Tm, but this technique is not high throughput. DSC requires a moderate amount of material (typically 500 μL, formulated at a concentration of 1 mg/mL), and the analysis time is generally 1–2 h per sample.Citation39 For this work, we decided to use the higher-throughput nano-differential scanning fluorimetry (nanoDSF) method since 48 samples can be analyzed in parallel within 2 h, and only 10 μL is required per sample. NanoDSF was used to determine the temperature at the onset of unfolding (Tonset), the temperature at the midpoint of first melting transition (Tm1), and the Tagg for the 83 mAbs at 1 mg/mL and 70 mg/mL.

In this panel, the Tm1 was >65°C for all but three mAbs (Table S1). Tm1 showed good agreement when measured at 1 mg/mL vs 70 mg/mL (Figure S7A; R² = 0.929), suggesting concentration did not affect the thermal melting as detected by nanoDSF. Onset of unfolding was also similar between 1 mg/mL vs 70 mg/mL (Figure S7B; R² = 0.893), also suggesting that concentration has a low impact when measuring thermal melting dynamics by nanoDSF. Tagg at 1 mg/mL vs 70 mg/mL did show significant differences, with 50 of the 83 mAbs not having a measurable Tagg at 1 mg/mL but all but one of the 83 mAbs having measurable Tagg at the higher concentration, 70 mg/mL. Comparing the 33 molecules that had a measurable Tagg at 1 and 70 mg/mL, we observed generally higher Tagg at 1 mg/mL and a lack of good correlation (Figure S7C; R² = 0.343). Evaluating the aggregation dynamics by comparing the Tonset, Tm1, and Tagg for the set of samples at 70 mg/mL shows the majority of the Tagg temperatures are higher than the Tonset temperatures, suggesting aggregation is not occurring until the molecules are unfolding (Table S5). Further, our results indicate that thermally induced aggregation is concentration-dependent, and high-concentration behavior cannot be reliably extrapolated from low-concentration nanoDSF measurements.
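As a rough illustration of the concentration comparison, the sketch below computes a coefficient of determination between paired measurements at two concentrations, using only molecules with a measurable value at both. It assumes R² is the squared Pearson correlation of a simple linear fit, which the text does not state explicitly, and the temperatures shown are invented.

```python
def r_squared(x, y):
    """Squared Pearson correlation over pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy ** 2 / (sxx * syy)


# Hypothetical Tagg values (°C); None marks "no measurable Tagg".
tagg_1_mg_ml = [78.0, None, 81.5, 76.0, None, 84.0]
tagg_70_mg_ml = [70.5, 68.0, 69.0, 71.0, 66.5, 74.0]
paired_r2 = r_squared(tagg_1_mg_ml, tagg_70_mg_ml)
```

Dropping unpaired molecules mirrors the restriction in the text to the molecules with a measurable Tagg at both concentrations.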

Sequence related chemical liabilities

Chemical transformation of amino acid residues in a protein therapeutic, either during storage or in vivo, can affect safety and efficacy.Citation40 Therefore, candidate molecule amino acids that are prone to modification (also known as “hotspots”) are often targeted for removal through protein engineering. Accurate prediction of hotspots from sequence motifs and structural context would enable more focused engineering and faster therapeutic development.Citation41 To develop a complete picture of the modification landscape across the panel of 83 antibodies, we used a hotspot-centric approach to the data. In brief, amino acid sequences of all 83 molecules were scanned computationally to generate a list of all possible theoretical methionine oxidation, tryptophan oxidation, aspartate isomerization, and asparagine deamidation sites. We then performed proteolytic digestion with trypsin followed by reversed phase liquid chromatography tandem mass spectrometry (rpLC-MS/MS) of all 83 antibodies under 3 conditions: time zero (T0, no stress), photo stress of 192 klux*hr visible light, and after incubation at 40°C for 4 weeks. Once the data for all 249 rpLC-MS/MS analyses were collected, we used Mass AnalyzerCitation42 (software developed in house at Amgen) and Byos Software (Protein Metrics by Dotmatics, Cupertino, CA) to process and interpret the raw rpLC-MS/MS data, and coupled this with manual spot-checking of random data points as well as all data points with high modification values using SkylineCitation43 software. This analysis yielded a large data set with over 13,000 data points, a detailed discussion of which will be the topic of a future publication.
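The computational scan for theoretical hotspots can be sketched as below. This is a simplified stand-in, not the in-house tooling: it flags every Met, Trp, Asp, and Asn in a sequence, records the two-residue motif (residue plus the following residue) for Asp and Asn only, and uses 1-based sequence positions rather than a structural numbering scheme. The function and dictionary names are hypothetical.

```python
# Residue -> liability type, following the four liabilities named in the text.
LIABILITY_BY_RESIDUE = {
    "M": "Met oxidation",
    "W": "Trp oxidation",
    "D": "Asp isomerization",
    "N": "Asn deamidation",
}

# Liabilities whose rates depend on the next residue get a two-letter motif.
MOTIF_RESIDUES = {"D", "N"}


def scan_hotspots(sequence):
    """Return (position, liability, motif) for every theoretical hotspot.

    Positions are 1-based. A D or N at the C-terminus has no following
    residue, so its motif is the single letter.
    """
    seq = sequence.upper()
    sites = []
    for index, residue in enumerate(seq):
        liability = LIABILITY_BY_RESIDUE.get(residue)
        if liability is None:
            continue
        motif = seq[index:index + 2] if residue in MOTIF_RESIDUES else residue
        sites.append((index + 1, liability, motif))
    return sites


# Toy peptide, not a sequence from the panel.
hotspots = scan_hotspots("ANGDTWM")
```

A list like this serves as the set of theoretical sites against which peptide mapping results can be matched, so that sites with no detected modification are still counted.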

We focus here on a comparison of our results for deamidation and isomerization modifications, both the sites and the levels, to what has been reported previously in the literature. We define both deamidation and isomerization events as being modifications of ≥2% after 4 weeks at 40°C and summarize events versus sequence motif in Table S6. It has been well established that the frequency of both types of modification is highly dependent on the subsequent residue in the sequence. In our deamidation data, we found the highest frequency of deamidation events (33%) in NG motifs, followed by NH (25%), NS (19%), then NA (17%) (Figure 3a). This is largely in line with observations from Lu et al.,Citation44 who report the top three most frequently modified motifs to be NG>NH>NS. However, they did not detect any modification at NA sites and instead found deamidation at low frequency in NT, NN, and very low frequency at NF, NY, NQ, and NW motifs. Sydow et al.,Citation45 on the other hand, provide additional confirmation for NG as the most frequently modified motif, followed by NT, NS, and NN. It is important to note that the stress conditions applied in each of these studies are different; as a result, though the rankings are roughly aligned, the overall modification levels and detection frequencies for each motif vary considerably from one study to the next.

Figure 3. Deamidation (a) and isomerization (b) rates for each indicated motif. Fractions indicate the number of sites with ≥2% modification after 4 weeks at 40°C over total number of CDR sites with coverage in the peptide mapping data.

Two greyscale bar charts showing number of sites with either deamidation or isomerization by CDR sequence motif. Three to four tall bars are clustered to the left in both plots with the rest of the positions at baseline, indicating a small number of sequence motifs account for most of the chemical modifications observed.

While our deamidation data are in reasonable alignment with the prior literature, our isomerization data are not. This discrepancy is likely a result of the extremely low number of detected isomerization events in our set of antibodies. In fact, we only detected 3 CDR isomerization events at ≥2% modification across all 83 mAbs, with one event each occurring at DG, DN, and DT motifs (Figure 3b). This means that the relative prevalence of isomerization at each motif in our data is entirely dictated by the number of that motif in the set. We had 24 DG, 16 DT, and only 4 DN sites with coverage in the peptide mapping data, resulting in calculated modification rates of 4%, 6%, and 25% for each motif, respectively. Based on this low level of detection, we do not believe these rates are likely to be broadly representative of isomerization rates for these motifs in antibody CDRs. A detailed examination of the sequences containing DG motifs in these molecules reveals that 21 of the 24 CDR DG sites with acceptable rpLC-MS/MS data are in conserved germline positions. Further, 17 of those are in CDR-H2, where Lu et al. had previously observed a decreased incidence of DG isomerization.Citation44 None of these evolutionarily conserved germline positions modify to greater than our 2% cutoff, and this likely played a role in the low frequency of isomerization at DG sites we detected. For comparison, all 4 DN sites are non-germline motifs and all are located at different structural positions based on the AHo numbering system.Citation46 It is worth noting here that the mAbs in this set were not chosen specifically for diversity in their deamidation and isomerization sites, but rather for diversity in their biophysical properties.
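The per-motif rates quoted above follow directly from event counts over covered sites. The sketch below reproduces that arithmetic under the stated definition of an event (at least 2% modification). The site counts (24 DG, 16 DT, 4 DN, one event each) come from the text; the individual modification levels of 5.0 and 0.0 are placeholders chosen only to fall on either side of the cutoff.

```python
from collections import defaultdict


def motif_event_rates(site_levels, cutoff_percent=2.0):
    """Map motif -> (events, covered_sites, rate_percent).

    site_levels is an iterable of (motif, percent_modified), one entry per
    CDR site with peptide mapping coverage. A site counts as an event when
    its modification level is at or above the cutoff.
    """
    sites = defaultdict(int)
    events = defaultdict(int)
    for motif, percent_modified in site_levels:
        sites[motif] += 1
        if percent_modified >= cutoff_percent:
            events[motif] += 1
    return {
        motif: (events[motif], sites[motif], 100.0 * events[motif] / sites[motif])
        for motif in sites
    }


# Counts from the text; the levels themselves are placeholders.
site_levels = (
    [("DG", 5.0)] + [("DG", 0.0)] * 23
    + [("DT", 5.0)] + [("DT", 0.0)] * 15
    + [("DN", 5.0)] + [("DN", 0.0)] * 3
)
rates = motif_event_rates(site_levels)
```

With a single event per motif, the rate is simply the reciprocal of the site count, which is why the rarest motif (DN) shows the highest apparent rate.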

Viscosity prediction model

High viscosity at antibody concentrations >100 mg/mL limits developability by reducing the drug product delivery design space for subcutaneous injection and often results in larger dose volumes or the need for additional engineering during manufacturing.Citation12 The viscosity of antibody solutions is non-linear with concentration and can vary widely, even in the same formulation buffer, depending on protein sequence. Further, high concentration viscosity is one of the most “expensive” molecular attributes to measure because a large protein mass is needed to reach high concentrations in measurable volumes. Predicting or lowering the cost of this molecular attribute was therefore one of our highest priorities.

In addition to the industry standard cone-and-plate measurement of viscosity at 150 ± 10 mg/mL, we designed our assay package to include multiple assays and analytics that have been reported to be useful for predicting high concentration colloidal stability. We prioritized assays and measurements that are high throughput and require small amounts of protein. Based on internal experience and prior literature reports, we conducted affinity chromatography self-interaction nanoparticle spectroscopy (AC-SINS)Citation47–49 and PEG precipitationCitation50,Citation51 assays, and we measured zeta potentialCitation50,Citation51 and kD by DLS.Citation50,Citation52–56

We trained binary classifiers for predicting viscosity using a threshold of 15 cP at a concentration of 150 ± 10 mg/mL to mark the boundary between “low” and “high” viscosity molecules. Twelve molecules were excluded from model development because the classification of their viscosity at 150 ± 10 mg/mL was ambiguous. Specifically, molecules were excluded if the measurement concentration was <140 mg/mL with viscosity <15 cP and were similarly excluded if the measurement concentration was >160 mg/mL with viscosity ≥15 cP. Of the 71 molecules used to train and evaluate the viscosity prediction model, 43 were low viscosity and 28 were high viscosity.
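The labeling and exclusion rule described above can be sketched in a few lines of Python. This is only an illustration of our reading of the rule, not the code used in the study; the function name and argument names are hypothetical. The rationale we assume is that viscosity rises with concentration, so a sub-threshold reading taken below 140 mg/mL, or a supra-threshold reading taken above 160 mg/mL, does not determine the class at 150 mg/mL.

```python
def viscosity_label(concentration_mg_ml, viscosity_cp,
                    threshold_cp=15.0, low_conc=140.0, high_conc=160.0):
    """Return "low", "high", or None when the class at 150 mg/mL is ambiguous.

    Hypothetical helper: thresholds are taken from the text, names are ours.
    """
    # Measured too dilute and below threshold: could still be "high" at 150 mg/mL.
    if concentration_mg_ml < low_conc and viscosity_cp < threshold_cp:
        return None
    # Measured too concentrated and at/above threshold: could still be "low" at 150 mg/mL.
    if concentration_mg_ml > high_conc and viscosity_cp >= threshold_cp:
        return None
    return "high" if viscosity_cp >= threshold_cp else "low"


# Illustrative (made-up) measurements: (concentration in mg/mL, viscosity in cP).
examples = [(150.0, 9.0), (152.0, 22.0), (135.0, 12.0), (165.0, 18.0), (135.0, 30.0)]
labels = [viscosity_label(c, v) for c, v in examples]
```

Note that a molecule measured below 140 mg/mL that already exceeds 15 cP is retained as "high" under this reading, because its class at 150 mg/mL is not in doubt.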

Two categories of features, structure-based and HT experimental (i.e., AC-SINS, DLS, and PEG precipitation), were constructed. All features are numeric, except AC-SINS Risk Bin, which is categorical. Linear and non-linear classifiers were evaluated, but the non-linear classifiers show evidence of significant overfitting (due to the limited number of training instances) and are thus eliminated from further discussion. Details of the molecule selection criteria, feature construction, and model training are presented in the Methods section.

Feature relevance

Prior to model training, we computed Spearman’s rank order correlation between each feature and the real-valued viscosity measurement (in cP) to assess relevance. Table 2 reports the top features in each category that exhibit correlations with measured viscosity of magnitude ≥0.4. Spearman correlations between all features considered during model development and the measured viscosities are presented in Table S7. We note that the correlations were not used during model development and are presented here to provide a sense of which feature categories provide the most information about high concentration viscosity.

Table 2. Spearman rank correlations of the top features in each category that achieve correlations with measured viscosity of magnitude ≥0.4. Structure-based feature names are as defined by the MOE software package. All p-values are <0.001.

The top feature is the DLS interaction parameter, kD (ρ = 0.64; p < 0.001), but there are substantially more structure-based features that achieve correlations of magnitude ≥0.4 (p < 0.001), and the difference in average correlation between HT and structure-based features is minimal. We conclude that each feature category contains information relevant to viscosity, and so should be included during model development.
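As a minimal sketch of this relevance screen, the following Python computes Spearman’s ρ as the Pearson correlation of average ranks and keeps features whose correlation magnitude meets a cutoff. It is not the authors’ implementation; the function names, the toy feature names, and all numeric values are invented for illustration, and p-values are not computed.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def relevant_features(features, target, cutoff=0.4):
    """Keep features whose |rho| against the target is at least the cutoff."""
    selected = {}
    for name, values in features.items():
        rho = spearman_rho(values, target)
        if abs(rho) >= cutoff:
            selected[name] = rho
    return selected


# Toy data only: five made-up molecules and three made-up features.
toy_viscosity = [5.0, 8.0, 12.0, 20.0, 35.0]
toy_features = {
    "feature_a": [-1.0, -2.0, -4.0, -8.0, -16.0],
    "feature_b": [3.0, 1.0, 2.0, 5.0, 4.0],
    "feature_c": [2.0, 5.0, 1.0, 4.0, 3.0],
}
selected = relevant_features(toy_features, toy_viscosity)
```

Because the screen uses ranks, it is insensitive to monotonic transformations of either the feature or the viscosity, which suits a property that varies non-linearly with its drivers.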

We also analyzed the correlation between HT experimental assays and structural features computed using MOE (Table S7). The DLS Interaction Parameter kD showed strong positive correlation (≥0.8) to apparent charge, Fab and Fv net charge, protein dipole moment and mobility, zeta potential, and pI. Experimentally measured zeta potential showed moderate correlation (~0.5) to pI and CDR and Fv net charge. On the other hand, AC-SINS related attributes showed weak correlation to the calculated structural features, indicating that AC-SINS is orthogonal to the structural features and provides information not captured by them. Hence, AC-SINS-related attributes (AC-SINS DiffC (µm²/s) and AC-SINS λmax (nm)) rank very highly in the best-performing ML model as shown by the Shapley analysis (Figure S8).

Model performance

We evaluated both linear and non-linear classification models trained using traditional machine learning methods, as opposed to deep learning methods, which would require a much larger dataset. As expected, non-linear models showed evidence of overfitting our modestly sized data set (71 molecules), so we present the results of Logistic Regression, a simple linear classifier. Elastic net regularization was used during model training to further reduce overfitting. We trained multiple Logistic Regression models using various combinations of features. Table 3 reports median test metrics (Balanced Accuracy, Area Under the Curve (AUC), Precision, Recall, Matthews Correlation Coefficient, and Positive Likelihood Ratio (PLR)) after performing 100 random stratified train-test splits. Boxplots of the metrics for all 100 splits are provided in Figure S9. As expected, models trained on HT in vitro assay features outperform those trained only on structure-based features. Combining HT features with structure-based features increases performance over HT alone, suggesting that the different feature categories capture different aspects of viscosity. Figure S8 shows the relative importance of the features used in the model trained on HT and structure-based features, as computed using Shapley values. We found that AC-SINS λmax is the most important feature, followed by the number of positively charged patches and the protein dipole moment.
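The resampling scheme can be illustrated with a short, dependency-free Python sketch. It draws repeated stratified splits, so that each class is sampled at the same test fraction, and reports the median of a per-split score. The 30% test fraction follows the 70:30 ratio stated in the Discussion; the function names are hypothetical, and the score used here is a placeholder standing in for "fit a regularized logistic regression on the training indices and evaluate it on the test indices", which is not shown.

```python
import random
import statistics


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), sampling each class at test_fraction."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def median_over_splits(labels, score_fn, n_splits=100, test_fraction=0.3, seed=0):
    """Median of score_fn(train_idx, test_idx) over repeated stratified splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_splits):
        train_idx, test_idx = stratified_split(labels, test_fraction, rng)
        scores.append(score_fn(train_idx, test_idx))
    return statistics.median(scores)


# Class sizes from the text: 43 low-viscosity and 28 high-viscosity molecules.
labels = ["low"] * 43 + ["high"] * 28


def low_fraction(train_idx, test_idx):
    # Placeholder score (not a model metric): share of "low" molecules in the test set.
    return sum(labels[i] == "low" for i in test_idx) / len(test_idx)


median_low = median_over_splits(labels, low_fraction)
```

Stratification keeps the class ratio of every test set close to that of the full panel, which matters for a data set this small because an unstratified 30% draw could contain very few high-viscosity molecules.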

Table 3. Median test metrics of viscosity binary classifiers. One hundred random stratified train-test splits were performed for various feature combinations. Median test metrics of logistic regression models trained with an elastic net regularization penalty applied are reported. HT = high throughput experimental features (AC-SINS, DLS, and PEG precipitation). Structure = 112 structure-based features derived from structural models. AUC = area under the receiver-operator curve. MCC = Matthews correlation coefficient. PLR = positive likelihood ratio (true positive rate/false positive rate). The confusion matrix for the best model (HT + Structure) is as follows: true positives = 40, true negatives = 23, false positives = 5, false negatives = 3. Here, ‘positive’ refers to a molecule with low viscosity (<15 cP at 150 mg/mL).
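For readers who wish to relate the confusion matrix in the Table 3 caption to the named metrics, the following Python sketch applies the standard definitions to those four counts. The function name is hypothetical. Values computed this way come from a single confusion matrix and need not coincide with the median-over-splits values reported in the table.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)               # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    false_positive_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "plr": recall / false_positive_rate,
    }


# Counts from the Table 3 caption ('positive' = low viscosity).
viscosity_metrics = classification_metrics(tp=40, tn=23, fp=5, fn=3)
```

The counts are consistent with the class sizes given earlier: 40 + 3 = 43 low-viscosity molecules and 23 + 5 = 28 high-viscosity molecules.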

Pharmacokinetics prediction model

Many attributes of a biologic therapeutic affect its clearance in vivo, including both specific and nonspecific interactions with proteins and tissues in the body. We considered whether our panel of antibodies, with uniform IgG1 Fc isotype and sequence-diverse Fv domains that bind to 33 distinct targets, would be useful to build tools to predict aberrantly high nonspecific clearance, in the absence of immunogenicity or specific effects such as target-mediated drug disposition (TMDD). To focus on the impact of physiochemical properties on clearance in our analysis, we eliminated 19 mAbs with mouse cross reactivity and 8 mAbs with observed anti-drug antibodies in mouse.

Based on our prior experience and reports in the literature,Citation57–59 several of the assays and measurements in our data package could potentially be useful for predicting the impact of physiochemical properties on in vivo clearance, including 1) charge (zeta potential),Citation60–62 2) self-interaction assay (AC-SINS),Citation63 and 3) cross-interaction assays to a variety of substrates,Citation64 including those of hydrophobic character (hydrophobic interaction chromatography, membrane preparation enzyme-linked immunosorbent assay (ELISA)), negatively charged character (heparin chromatography),Citation65 positively charged character (poly-D-lysine ELISA, PEI ELISA), or mixed character (baculovirus particle (BVP) ELISA).Citation66 We also measured the interaction of each mAb with the neonatal Fc receptor (FcRn), through both an AlphaScreen binding assayCitation67 and FcRn chromatography,Citation68 to assess the impact of diverse Fab sequences on FcRn engagement by our invariant Fc sequence.

We trained binary classifiers for predicting PK AUCt using a threshold of 3.9 × 10⁶ h × ng/mL to mark the boundary between “low” and “high” clearance molecules. Of the 55 molecules used to train and evaluate the PK clearance model, 50 were low clearance and 5 were high clearance. Two categories of features were constructed: structure-based and experimental (Heparin Chromatography, DLS, Zeta Potential, Ammonium Sulfate Precipitation, PEG Precipitation, BVP, Membrane Prep Assay, FcRn Chromatography, SE-UHPLC, Poly-D-Lysine Assay, PEI Assay, HIC, DSF, Thermolysin, Sepax, and AC-SINS). All features are numeric. As with the viscosity prediction model, linear and non-linear classifiers were evaluated, but the non-linear classifiers show evidence of significant overfitting and are thus eliminated from further discussion. Details of the molecule selection criteria, feature construction, and model training are presented in the Methods section.

Feature relevance

We computed Spearman’s rank order correlation between each feature and the real-valued AUCt measurement (in h × ng/mL) to assess relevance. Table 4 reports the top features in each category that achieve correlations with measured AUCt of magnitude ≥0.4. Spearman correlations between all features used in the model and the measured AUCt are presented in Table S8.

Table 4. Spearman rank correlations of the top features in each category that achieve correlations with measured PK AUCt of magnitude ≥0.4. Structure-based feature names are as defined by the MOE software package. All p-values are <0.003.

The top feature is heparin chromatography elution (percentage buffer B at elution peak) (ρ = 0.48; p < 0.001), but there are several structure-based features that exhibit correlations of magnitude ≥0.4 (p < 0.003), and the difference in the average correlations between the experimental and the structure-based features is modest. We conclude that each feature category contains information relevant to PK clearance, and so should be included during model development.

Model performance

We followed the same model development workflow used to train the viscosity model. Table 5 reports median test metrics after performing 100 random stratified train-test splits. Boxplots of the metrics for all 100 splits are provided in Figure S9. Unlike the viscosity model, models trained on structure-based features perform the best across the majority of the metrics considered, outperforming those trained only on in vitro assays with respect to AUC, Precision, Matthews Correlation Coefficient (MCC), and PLR. The model combining structure-based features with experimental features has similar performance to the one with structure-based features alone, suggesting that structure features were capable of capturing most relevant aspects of PK clearance in this data set. However, the data used to train this model are both highly imbalanced (50 low clearance vs 5 high clearance) and small (55 molecules), so this empirical observation should not be taken as definitive. Figure S10 shows the relative importance of the features used in the model trained on structure-based features, as computed by Shapley values. We found that the number of positive patches is the most important feature to the model, followed by the water-accessible surface area of the protein, the number of hydrophobic patches, and the magnitude of the dipole moment of the zeta potential.
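The caution about class imbalance can be made concrete with a small Python calculation. It scores a hypothetical trivial baseline that labels every one of the 55 molecules "low clearance"; this baseline is our illustration and is not a model evaluated in the study.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all molecules classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of the true positive rate and the true negative rate."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Hypothetical baseline: predict "low clearance" for every molecule.
# With 'positive' = high clearance, it recovers none of the 5 positives
# and all 50 negatives.
baseline = {"tp": 0, "tn": 50, "fp": 0, "fn": 5}
baseline_accuracy = accuracy(**baseline)
baseline_balanced_accuracy = balanced_accuracy(**baseline)
```

The baseline reaches roughly 91% plain accuracy while its balanced accuracy is only 0.5, which is why metrics that account for both classes are the more informative ones for a 50:5 panel.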

Table 5. Median test metrics of PK clearance binary classifiers. One hundred random stratified train-test splits were performed for various feature combinations. Median test metrics of logistic regression models trained with an elastic net regularization penalty applied are reported. Experimental = (heparin chromatography, DLS, zeta potential, ammonium sulfate precipitation, PEG precipitation, BVP, membrane prep assay, poly-D-lysine assay, PEI assay, HIC, DSF, SEPAX, and AC-SINS). Structure = 112 structure-based features derived from structural models. AUC = area under the receiver-operator curve. MCC = Matthews correlation coefficient. PLR = positive likelihood ratio (true positive rate/false positive rate). The confusion matrix for the best model (structure) is as follows: true positives = 3, true negatives = 49, false positives = 1, false negatives = 2. Here, ‘positive’ refers to a molecule with high clearance (classified against the AUC threshold of 3.9 × 10⁶ h × ng/mL).

Discussion

The biopharmaceutical industry is driven to reduce cycle times and increase therapeutic campaign success rates to deliver more effectively for patients. One approach toward rapid and effective engineering of biologic therapeutics is the development of in silico models that can predict complex protein properties using data from HT assays, or better, using the protein sequence alone. We initiated our internal build of high-quality predictive models by dedicating resources to generation of a comprehensive dataset that was relevant to our internal antibody therapeutics development platform. In selecting molecules for our data set, we focused on standardizing and controlling factors that could result in “noise” (e.g., non-platform expression hosts, purification protocols, formulation buffers) while allowing for broad variability in the properties we desired to predict, e.g., viscosity. By designing a fit-for-purpose panel that was produced, formulated, and tested using our platform methods – instead of relying on a collection of published data – we aimed to deliver data with the highest internal utility for future internal prediction. Further, we avoided any property biases inherent in those molecules that “made it” to publication or to the clinic.

The package of HT empirical data that we collected in this effort included several assays that were not part of our classic molecule assessment (developability) platform, but were reported in the literature to be useful for property prediction. We internally implemented these new assays as a valuable exercise to avoid some of our own internal “platform” biases and to cast a wider net for protein analytics. In particular, we wanted to understand how different analytical or assay conditions or formulation buffers can lead to different results and influence models. Ultimately, like others in the field,Citation15–17 we were also mindful of selecting conditions that aligned with our internal processes to generate models that would be most relevant and applicable to our therapeutic platforms. As a result of this experimental design, some of the assays described here are redundant with others in our data set. Indeed, for us another valuable outcome of the work is that it enables design of an efficient and effective molecule assessment assay package that provides the highest internal predictive value with minimal resource expenditure, in addition to delivering the reported predictive models for viscosity and in vivo clearance. Importantly, it can also be tailored by the inclusion or exclusion of analytical assays, depending on the stage at which the analytical package is being applied, or the attribute of interest being measured.

A key question at the outset of the work was whether reliable prediction would require experimentally determined high-resolution protein structures or whether predicted protein structures would be sufficient. A pipeline that needs experimental structures for every variant would require substantial investment in new methods for HT structure determination. To our satisfaction, our results show that modern structure prediction methods predict most structural features of previously unknown Fabs with RMSD < 2 Å.

We are acutely aware that some of the results presented herein are different from those previously published on different mAbs. For example, our reported aspartate isomerization values are lower than those reported by Lu et al.Citation44 Our HIC retention times are also more extended than those reported by Jain et al.Citation15 Both differences reflect differences in the protein sequence distribution as well as in methods and conditions, and such differences may be expected to impact modeling outcomes, highlighting the value of continuing to collect larger developability data sets with diverse protein sequences and biophysical properties. The effect of differences in assay solution conditions could be explicitly explored in future studies. We emphasize that the models presented here are merely illustrative of the kinds of predictive tools that might be learned from these data. In particular, the limited size of our dataset should be considered when interpreting these results. For example, the reported performance metrics may not reflect the models’ performance on unseen data because we do not have a true hold-out set. Instead, we estimated the generalization performance by training 100 models for each endpoint using random stratified 70:30 splits and reported the median performance. Moreover, the models were trained on data from mAbs and so are not expected to generalize to other biologic modalities.

Indeed, we have already collaborated to demonstrate the use of these data to build successful models beyond those reported here, including alternative viscosity prediction methodology with our academic collaborators (Makowski et al., Reduction of monoclonal antibody viscosity using interpretable machine learning, mAbs; submitted) and a physiologically based pharmacokinetic (PBPK) model to predict antibody PK in collaboration with internal experts in PK and drug metabolism (Liu et al., Utility of physiologically based pharmacokinetic modeling to predict inter-antibody variability in monoclonal antibody pharmacokinetics, mAbs; submitted), and we are open to further collaboration.

We will briefly highlight some differences between the PK prediction model reported here and our later collaboration with Liu et al. In contrast with their work using 56 mAbs, the final panel reported here consisted of 55 mAbs, as mAb84 was removed from developability analyses as described above. We also included a more comprehensive set of experimental data and a structural and sequence-based evaluation as part of our computational strategies (viscosity prediction model and PK prediction model). Liu and coworkers identified and included an additional set of 14 mAbs, not included in this analysis. These 14 mAbs were evaluated for PK and heparin chromatography (only) to test the predictive utility of their developed PBPK model. Interestingly, the lead covariate (heparin chromatography assay metric) employed within the PBPK model was also found to be one of the top three most significant experimental features in the machine learning model. Despite differences in the statistical approaches and overall computational strategy, both investigations independently identified heparin chromatography as useful in predicting mAb PK, decreasing the need for extensive in vitro experimental data to de-risk antibody sequences in the early discovery phase.

Additionally, in this study, we evaluated charge-related calculated parameters including isoelectric point (pI) and various positive charge patch metrics, but none correlated directly to mAb clearance. Historically, the effect of varying pI on mAb PK is one of the most routinely studied physiochemical property–PK relationships. Previous investigations lack consensus; the relationship between mAb plasma clearance and pI charge was reported to be either monotonic,Citation69,Citation70 bell-shaped,Citation22,Citation40 or uncorrelated.Citation66 The results from the current study are consistent with the last of these. Other reports have demonstrated that disrupting surface charge distribution without changing pI may also alter a mAb’s plasma exposure,Citation71,Citation72 or that reducing pI whilst more evenly distributing surface charge can significantly reduce clearance.Citation73 Positive charges on the mAb surface may interact with the negatively charged cell membrane, leading to higher nonspecific tissue uptake.Citation74 Additionally, charge patches may have a detrimental effect on a mAb’s interaction with FcRn.Citation61 Clearly, there are many nuances to the complex nonspecific interaction between mAbs and the cell membrane; thus, it is not surprising that no singular physiochemical property effectively captures the full consequence of this interaction on mAb plasma PK. We surmise that the experimental setup of heparin chromatography bears more similarity to this multi-faceted nonspecific tissue interaction occurring in vivo, and hence, this assay metric emerged as one of the leading features predictive of non-target-mediated mAb plasma clearance. The PK modeling results again emphasize that no single metric or assay is sufficient to predict outcomes, and a developability package that more comprehensively calculates and/or measures a broad scope of molecule features is necessary.

In addition to the PK prediction model described above, the example of a viscosity classifier model reported here can be of practical utility in a drug discovery pipeline by implementation at the stage of selecting parental antibody binders for developability engineering. Without needing to measure or calculate a precise viscosity value, predicting with >80% accuracy which antibodies will be “high” viscosity enables both the selection of higher quality lead panels and investment of engineering resources where there are higher probabilities of successful clinical candidates. The model demonstrates the value of collecting high-quality, labeled training data, and we expect that both the scope and accuracy of our predictive property models will continue to improve as we add data to the training set when we make and test new protein sequences in our internal biologic therapeutics pipeline and enable more sophisticated modeling approaches. For example, we are well advanced in separate efforts to use internal data sets to build models to accurately predict viscosity of large molecule therapeutic modalities beyond standard mAbs, such as multispecific antibodies, antibody fusions, and non-antibody protein therapeutics, which will be the subject of a future publication.

In conclusion, the antibody developability data set that we are sharing here is valuable because of several features of experimental design, including 1) high primary protein sequence diversity with molecules that bridge to other reported studies, 2) standardized and quality-controlled protein production and formulation, 3) collection of data from a comprehensive set of in vitro assays, and 4) direct measurement of expensive outcomes including high-concentration colloidal stability and in vivo PK. We expect it will be useful to the community, as it has been to us, in efforts to build computational methods enabling more rapid delivery of protein therapeutics to patients.

Materials and methods

Protein production and quality control

Fifty-three antibodies were produced by the Bioprocessing Technology Institute (BTI, Singapore) and the remaining antibodies were produced at Amgen, Inc. Antibody heavy and light chains were cloned into a proprietary, dual-promoter bicistronic expression vector and transfected into CHO Z2A4–18 cells. Selections were performed with puromycin (20 µg/mL) prior to large-scale production. During production, feeds were done on days 3, 5, 7, 9, and 11, with additional glucose feeds on days 7 and 12 to maintain titer of 2.0–4.5 g/L. After 14 days, cell viability and viable cell density were measured, cells were removed by centrifugation, and conditioned media (CM) was filtered using 0.2 µm cellulose acetate filters. Twenty-eight antibodies were produced at Amgen by expression in stably transfected CHO-K1 cells. The heavy and light chains were cloned into pTT26 vectors with either puromycin or hygromycin resistance cassettes, respectively, and transfected into CHO-K1 cells. After 7 days, cell viability and viable cell density were measured, cells were removed by centrifugation, and CM was filtered using 0.2 µm cellulose acetate filters. CM titers were measured by ForteBio analysis using protein A sensor tips (Sartorius; Cat. #18–5010).

Clarified supernatants were affinity captured by MabSelect SuRe chromatography (Cytiva; Cat. #17543803), using Dulbecco’s phosphate-buffered saline (PBS) without divalent cations (Gibco; Cat. #14190367) as the wash buffer and 100 mM acetic acid, pH 3.6 as the elution buffer. All separations were carried out at ambient temperature. Peak fractionation was used to collect Protein A elutions; collection was initiated when the absorbance at 280 nm rose above 50 milli-absorbance units (mAU) and stopped when the absorbance fell below 50 mAU. Elution pools were diluted with 5 volumes of Milli-Q water, neutralized to approximately pH 5.0 using 1 M tris base, and filtered through a 0.22 μm cellulose acetate filter.

Antibodies were further polished using an SP sepharose high-performance column (Cytiva; Cat. #17108703) and washed with 5 column volumes of SP-Buffer A (20 mM sodium acetate, pH 5.0) followed by elution using a 40-column volume gradient to 100% SP-Buffer B (20 mM sodium acetate, 1 M sodium chloride (NaCl), pH 5.0). Peak fractionation was used to collect 4 mL fractions, starting when the absorbance at 280 nm reached 20 mAU, and ending when the absorbance dropped below that value. Pools were made based on the chromatogram, A280 concentration, LabChip GXII (Perkin Elmer), and analytical SE-HPLC analysis of fractions, and fractions that were ≥95% main peak by non-reduced MCE and analytical SE-HPLC were combined. The pools were diafiltered against approximately 30 volumes of 10 mM sodium acetate, 9% w/v sucrose, pH 5.2 using Slide-A-Lyzer dialysis cassettes with a 10 kDa cutoff membrane (Thermo Scientific; Cat. #66830) and further concentrated using Vivaspin-20 centrifugal concentrator with a 10 kDa cutoff membrane (Sartorius Stedim Biotech; Cat. #VS2001). The concentrated material was then filtered through a 0.8/0.2 μm cellulose acetate filter and the concentration was determined by the absorbance at 280 nm using the calculated extinction coefficient. Sample purity was determined by LabChip GXII analysis under reducing (with 32.7 mM tris (2-carboxyethyl) phosphine (TCEP)) and non-reducing (with 25 mM iodoacetamide) conditions. Analytical SEC was carried out using a BEH200 column (Waters; Cat. #186005226) with an isocratic elution in 100 mM sodium phosphate, 50 mM NaCl, 7.5% ethanol, pH 6.9 over 10 min.

Forty-one recombinant Fabs were produced at Amgen (Thousand Oaks, CA) and expressed in stably transfected CHO-K1 cells. Thirty-three recombinant Fabs were produced at the Syngene-Amgen Research and Development Center (SARC). The Fd and light chains were separately cloned into monocistronic vectors with puromycin or hygromycin resistance cassettes and transfected into CHO-K1 cells. After 7 days, cell viability and viable cell density were measured, cells were removed by centrifugation, and conditioned media (CM) was filtered using 0.2 µm cellulose acetate filters. CM titers were measured by ForteBio analysis using protein A sensor tips (Molecular Devices).

Clarified supernatants were affinity captured using either CaptureSelect CH1-XL (ThermoFisher Scientific; Cat. #194346201 L) or KanCapG chromatography (Kaneka; Cat. #KPG01-B025-R), using 25 mM tris, 100 mM NaCl, pH 7.4 as the wash buffer. Fourteen Fabs were eluted from CaptureSelect CH1-XL with 50 mM sodium acetate, pH 4.0, immediately conditioned with 3-volumes 50 mM sodium acetate, pH 5.0, and directly loaded onto an SP sepharose high-performance column (Cytiva; Cat. #17108703) for further polishing. Twenty-seven Fabs were eluted from KanCapG with 100 mM glycine, pH 3.0, and conditioned with 3-volumes 50 mM sodium acetate, pH 5.5. KanCapG elutions were loaded onto an SP sepharose high-performance column for further polishing.

All SP sepharose high-performance columns were washed with 5 column volumes of SP-Buffer A (50 mM sodium acetate, pH 5.0) followed by elution using a 20-column volume gradient to 60% SP-Buffer B (50 mM sodium acetate, 1 M NaCl, pH 5.0). Peak fractionation was used to collect fractions, starting when the absorbance at 280 nm reached 50 mAU, and ending when the absorbance dropped below that value. All separations were carried out at ambient temperature. Pools were made based on the chromatogram, A280 concentration, LabChip GXII (Perkin Elmer), and analytical SE-HPLC analysis of fractions, and fractions that were ≥95% main peak by non-reduced MCE and analytical SE-HPLC were combined. The pools were diafiltered against approximately 30 volumes of 10 mM sodium acetate, 9% w/v sucrose, pH 5.2 using Slide-A-Lyzer dialysis cassettes with a 10 kDa cutoff membrane and further concentrated to approximately 10 mg/mL using Vivaspin-20 centrifugal concentrator with a 10 kDa cutoff membrane. The concentrated material was then filtered through a 0.8/0.2 μm cellulose acetate filter and the concentration was determined by the absorbance at 280 nm using the calculated extinction coefficient. Sample purity was determined by LabChip GXII analysis under reducing (with 32.7 mM TCEP) and non-reducing (with 25 mM iodoacetamide) conditions. Analytical SEC was carried out using a BEH200 column with an isocratic elution in 100 mM sodium phosphate, 50 mM NaCl, 7.5% ethanol, pH 6.9 over 10 min.

Reversed phase liquid chromatographic mass spectrometry and Mw determination

rpLC-MS method

All rpLC-MS data (to be compared to HT solid-phase extraction mass spectrometry (HT SPE-MS)) were acquired on an Agilent 6224 oaToF MS instrument hyphenated to a 1290II Infinity UHPLC system. Chromatographic separation was achieved using a 2.1 × 50 mm, Zorbax SB300, C8, 1.8 µm column (Agilent #857700–906), operated at a temperature of 70°C. The solvents used were as follows: mobile phase A was water containing 0.1% v/v trifluoroacetic acid (TFA), and mobile phase B was 90% n-propanol containing 0.1% v/v TFA. The gradient was as follows: 0.0 to 1.0 min, 20% mobile phase B; 1.0 to 9.0 min, 20–70% mobile phase B; 9.0 to 10.0 min, 70–100% mobile phase B, followed by a hold at 100% mobile phase B for one further minute. The flow rate was 0.4 mL/min. Approximately 15 μg of mAb was loaded onto the rpLC-MS system for each analysis. Data were acquired over the m/z range 1000–7000. The source fragmenter, skimmer and octopole 1 RF values were 460 V, 95 V and 800 V (peak-to-peak), respectively. The electrospray ionization (ESI) capillary voltage was 5.9 kV. Gas temperature was 340°C. Drying gas was 13 L/min. Nebulizer was 25 psig. Orthogonal acceleration-time of flight (oa-ToF) calibration was performed using the Agilent Tune Mix using the automated calibration procedure implemented through MassHunter Data Acquisition, version B10.1.48.
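The gradient program above can be written as a small lookup, which is convenient when checking at what solvent composition a given species elutes. The Python sketch below assumes linear ramps between the stated breakpoints, which the text does not state explicitly; the names GRADIENT and percent_b are hypothetical.

```python
# Breakpoints (time in min, % mobile phase B) taken from the method text.
# Linear interpolation between breakpoints is an assumption of this sketch.
GRADIENT = [(0.0, 20.0), (1.0, 20.0), (9.0, 70.0), (10.0, 100.0), (11.0, 100.0)]


def percent_b(t_min, program=GRADIENT):
    """Percentage of mobile phase B at time t_min under piecewise-linear ramps."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


midpoint_b = percent_b(5.0)  # halfway through the 20-70% ramp
```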

Samples were received at a concentration of 70 mg/mL and approximately 12 µg was diluted into 30 µL of 8 M Guanidine hydrochloride (GuHCl) in 50 mM Tris at pH 7.5. The protein was further reduced with 10 mM dithiothreitol (DTT) at 50°C for 40 min and quenched with 10 μL of 1% v/v TFA before 25 μL was analyzed by rpLC-MS. For molecular weight (Mw) confirmation under non-reducing conditions, the samples were diluted to and analyzed at a concentration of 1 mg/mL with a solution of 0.1% v/v TFA and 6 μg of material was injected for analysis by rpLC-MS.

Deconvolution parameters

Deconvolution was performed using both Maximum Entropy within MassHunter and Protein Metrics Intact.Citation75 Parameters for both are described as follows. For MassHunter Maximum Entropy Deconvolution, the following parameters were used: Mass range 20,000–300,000 Da; Mass Step 0.5 Da; use limited m/z range 2000–6000; Adduct Proton; Isotope width Automatic. For Protein Metrics Intact Deconvolution, the following parameters were used: Non-reduced sample deconvolved Mass range 20,000–300,000 Da; non-reduced sample m/z range 2000–6000; reduced sample deconvolved Mass range 10,000–100,000 Da; reduced sample m/z range 750–4000; Min difference between mass peaks (Da) 15; Peak sharpening disabled; Charge vectors spacing 0.6; Baseline radius (m/z) 15; Smoothing sigma (m/z) 0.02; Spacing (m/z) 0.04; Mass smoothing sigma 3; Mass spacing 0.5; Iteration max 20; Charge range 5–100.

Crystal structures

Prior to crystallization, all 23 Fabs were further purified via gel filtration using Superdex 75 (Cytiva 28989333 or 17517401) or Superdex 200 (Cytiva 28989336) columns (50 mM Tris, 150 mM NaCl) and screened with high-throughput crystallization screens, including PEG/Ion HT (HR2–139), PEGRx HT (HR2–086), Crystal Screen HT (HR2–130), and Index HT (HR2–134) from Hampton Research, and PEGs I (130904), PEGs II (130916), ProComplex (130915), MbClass I (130911), MbClass II (130912), JCSG+ (130920), JCSG I (130924), JCSG II (130925), JCSG III (130926) and JCSG IV (130927) from Nextal Biotechnologies. Crystal screens were performed with vapor diffusion in sitting drops at room temperature. Specific details regarding purification and crystal conditions for each Fab are available in Table S9. Synchrotron X-ray data were collected on all Fabs, and the structures were determined via molecular replacement and further modeled and refined. Specific details regarding X-ray data collection, molecular replacement, and modeling and refinement are available in Table S9.

Structure prediction and alignment with crystal structures

Homology models for variable regions of all 83 mAbs were built using four different software packages. Fab models from all 83 sequences were built in MOE,Citation35 Discovery Studio,Citation36 and Maestro.Citation37 Fv models from all 83 sequences were built in DeepAb based on the method described in Ruffolo et al.Citation38 All predicted structures built for the purpose of comparison with experimental structures were constructed without any reference to experimental structure data on the same molecule. Structure comparisons between homology models and crystal structures were calculated by using the Kabsch algorithm to superimpose select atoms before calculating root mean square deviation (RMSD) values,Citation76 with pairwise atom identities handled by sequence alignment using MUltiple Sequence Comparison by Log-Expectation (MUSCLE)Citation77 with manual alignment around sites involving missing density. Regions for alignment were defined using MOE.Citation35
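As an illustration of the superposition step, the following is a minimal sketch of a Kabsch-based RMSD calculation. It assumes the two coordinate sets have already been paired atom-by-atom (the sequence-alignment step is not shown), and the function name `kabsch_rmsd` is hypothetical; it is not the authors' implementation.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two paired (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # rotation mapping Pc onto Qc
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Because the rotation is chosen to minimize the residual, two copies of the same structure that differ only by a rigid-body move give an RMSD of essentially zero.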

Protein concentration and accelerated stability

Samples at ~70 mg/mL and ~150 mg/mL were prepared using Vivaspin-20 centrifugal concentrators with a 10 kDa cutoff membrane (Sartorius Stedim Biotech). The concentrated material was then filtered through a 0.8/0.2 μm cellulose acetate filter, and concentration was determined by the absorbance at 280 nm using the calculated extinction coefficient. Final concentrations for each molecule are reported in Table S1. Samples at 1 mg/mL were prepared by diluting the 70 mg/mL stocks into either 10 mM sodium acetate, 9% w/v sucrose, pH 5.2 or 10 mM sodium acetate, 150 mM sodium chloride, pH 5.2.

Multiple aliquots (100 μL) of the 70 mg/mL and 1 mg/mL samples were incubated at controlled temperatures of 5 ± 3°C in a Thermo Scientific refrigeration unit, or at 25°C or 40°C in a Thermo Scientific Precision incubator. An aliquot was removed at each timepoint for analysis. For light stress, aliquots of each molecule were exposed to 192 klux-hr visible light using a Caron photostability test chamber.

Tm and Tagg

Thermal stability was characterized by running TrpShift studies on the Prometheus NT.48 nanoDSF (NanoTemper Technologies GmbH). Each sample (10 μL) was loaded into a glass capillary provided by the manufacturer and placed in the instrument. A thermal ramp was applied at 1.0°C/min with a start temperature of 25°C and a stop temperature of 95°C. The laser excitation was 280 nm. Unfolding was measured by monitoring the fluorescence emission ratio 350 nm/330 nm. Onset of unfolding (Tonset) and melting temperature (Tm1) were determined using the PR.ThermControl v2.04 software (NanoTemper Technologies GmbH) for each molecule at both 1 mg/mL and at 70 mg/mL in formulation buffer (10 mM sodium acetate, 9% w/v sucrose, pH 5.2). The Tm is reported as the midpoint between the unfolding onset and the max unfolded state. The backscatter of the laser was measured simultaneously to determine the apparent onset of aggregation (Tagg). These values were derived automatically and directly from the first derivative of the raw unprocessed thermogram. Data were also visually inspected and the Tm and/or Tagg were adjusted manually, if required.

Viscosity measurement

Viscosity was determined using an Anton Paar MCR Rheometer by measuring the flow resistance due to the frictional forces between molecules. A flow sweep procedure was applied from 10 to 1000 1/s using a steel 20 mm Peltier plate with a 1.988° cone. Viscosity was measured in Pa-s and reported at a shear rate of 1000 1/s, where 1 mPa-s = 1 cP. An aliquot of 80 μL was loaded onto the plate for each measurement. Viscosity was measured for each molecule at approximately 150 mg/mL with 0.01% Tween 80 surfactant added for a final formulation buffer of 10 mM sodium acetate, 9% w/v sucrose, 0.01% Tween 80, pH 5.2.

Zeta potential

Protein samples (~25 μL @ ~20 mg/mL) were loaded by a gel-loading pipette tip into the center of a disposable capillary cell (DTS 1070), which was prefilled with 10 mM sodium acetate, 9% w/v sucrose, pH 5.2. Zeta potential was then measured on a Zetasizer (Malvern Panalytical Ltd.), using default settings for alternative voltage and frequency. DLS measurements were also performed before and after the zeta potential measurement to ensure that no proteinaceous aggregates had formed during measurement.

Dynamic light scattering

The diffusion interaction parameter kD was determined using DLS as reported previously.Citation78 Briefly, mAb samples were serially diluted to concentrations of 2, 6, 10, and 20 mg/mL. The diluted solutions (120 μL) were transferred to a 96-well plate and the diffusion coefficient was determined using a DynaPro Plate Reader II (Wyatt Tech), with a 632 nm laser at 25°C. The diffusion coefficient was provided by the Dynamics software directly, and kD was calculated from the equation: D = D0 (1+kD*c), where D is the measured apparent diffusion coefficient at protein concentration c and D0 is the diffusion coefficient at infinite dilution.
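The relation D = D0 (1 + kD*c) is linear in c, so kD can be recovered as the slope divided by the intercept of a straight-line fit. The sketch below assumes an ordinary least-squares fit and concentrations in mg/mL, with the result converted to mL/g; the function name `fit_kd` and the example values are hypothetical, and the Dynamics software may handle the fit differently.

```python
def fit_kd(conc_mg_ml, diff_coeff):
    """Fit D = D0 * (1 + kD * c); return (D0, kD in mL/g)."""
    n = len(conc_mg_ml)
    if n < 2 or n != len(diff_coeff):
        raise ValueError("need at least two paired (c, D) points")
    mean_c = sum(conc_mg_ml) / n
    mean_d = sum(diff_coeff) / n
    # Ordinary least-squares slope and intercept of D versus c.
    sxx = sum((c - mean_c) ** 2 for c in conc_mg_ml)
    sxy = sum((c - mean_c) * (d - mean_d) for c, d in zip(conc_mg_ml, diff_coeff))
    slope = sxy / sxx
    d0 = mean_d - slope * mean_c          # intercept = D0 (infinite dilution)
    kd_ml_per_mg = slope / d0             # slope = D0 * kD
    return d0, kd_ml_per_mg * 1000.0      # convert mL/mg to mL/g


# Illustrative values only: the four concentrations used in the assay and
# synthetic diffusion coefficients generated with kD = -10 mL/g.
concentrations = [2.0, 6.0, 10.0, 20.0]
measured = [4.0e-7 * (1 - 0.01 * c) for c in concentrations]
d0, kd = fit_kd(concentrations, measured)
```

A negative kD indicates net attractive self-interaction and a positive kD net repulsion.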

Ammonium sulfate and polyethylene glycol precipitation

For PEG precipitation, solutions of 10 mM acetate, 9% w/v sucrose, pH 5.2 were prepared at varying concentrations of PEG-6000 (Sigma 81260) from 0% to 40%. In a 96-well transparent Sensoplate microplate (Greiner Bio-One, 655892), mAb samples (70 mg/mL; 1.4 μL) were diluted with PEG solutions (100 μL) to a final mAb concentration of 1 mg/mL. The microplates were then incubated overnight at room temperature. After incubation, the wells were thoroughly mixed and turbidity was analyzed at 350 nm using a SpectraMax Plus (Molecular Devices) plate reader spectrophotometer. The PEG concentration at the midpoint of the precipitation curve was designated Conc50 (in %); values >40% were extrapolated from the curve.

For ammonium sulfate precipitation, a 4 M stock solution of ammonium sulfate was diluted using 10 mM acetate, 9% w/v sucrose, pH 5.2 to final concentrations ranging from 0.5 to 1.8 M. As above, mAb samples at 70 mg/mL were diluted in the ammonium sulfate solutions to a final mAb concentration of 1 mg/mL in a 96-well transparent Sensoplate microplate. After overnight incubation at room temperature, samples were thoroughly mixed and the optical density at 350 nm was measured as described previously. The ammonium sulfate concentration at the midpoint of the precipitation curve was defined as Conc50 (in M).

Standup monolayer adsorption chromatography

Standup monolayer adsorption chromatography was evaluated with a Sepax Unix SEC-300 column, 1.8 µm 300 Å 4.6 × 300 mm, p/n 211300–4630. UPLC was performed on a Waters Acquity Bio-H Class with photodiode array UV detection at 220 nm. Samples were diluted to 1 mg/mL in formulation buffer and approximately 6 μg (6 μL) was injected from each sample. The mobile phase consisted of 150 mM sodium phosphate, pH 7.0, and an isocratic flow rate of 0.5 mL/min was applied. The run time for each analysis was 60 min. Samples were held in the autosampler at 8°C in a low profile 96-well plate (Thermo p/n AB0700). Chromeleon 7.2.10 (Thermo Dionex) software was used to program the run and for data analysis.

Analytical size exclusion chromatography

For samples at a concentration of 70 mg/mL, SEC was performed using a Waters Acquity H-Class composed of a quaternary solvent delivery system and equipped with a single Acquity UPLC BEH SEC, 200 A, 1.7 µm, 4.6 × 150 mm column (Waters Corporation, 186005225). The column was equilibrated with mobile phase consisting of 100 mM sodium phosphate, 250 mM NaCl, pH 6.8 ± 0.1. The samples (60 ± 10 µg protein load, 1 µL injection directly from the 70 mg/mL protein solution in 10 mM acetate, 9% sucrose, pH 5.2) were analyzed using a 6 min method at a flow rate of 0.4 mL/min with a column temperature at 25°C. Protein elution was monitored at 280 nm. For samples at a concentration of 1 mg/mL, SEC was performed using a Waters Acquity H-Class composed of a quaternary solvent delivery system and equipped with a single Acquity UPLC BEH SEC, 200 A, 1.7 µm, 4.6 × 150 mm column (Waters Corporation, 186005225). The column was equilibrated with mobile phase consisting of 100 mM sodium phosphate, 250 mM NaCl, pH 6.8 ± 0.1. The samples (3 µg protein load, 3 µL injection directly from a 1 mg/mL solution in either 10 mM sodium acetate, 9% sucrose, pH 5.2 or 10 mM acetate, 150 mM sodium chloride, pH 5.2) were analyzed using a 3.5 min method at a flow rate of 0.75 mL/min with a column temperature at 50°C. Protein elution was monitored at 220 nm. For both methods, the data were evaluated using Chromeleon software.

Hydrophobic interaction chromatography

Antibody analyte stocks (2 mg/mL) were diluted 1:1 in mobile phase A for a final concentration of 1 mg/mL. For the pH 6.0 assay, the mobile phase A was 1.8 M ammonium sulfate, 100 mM potassium phosphate, pH 6.0 and the mobile phase B was 100 mM potassium phosphate, pH 6.0. For the pH 7.4 assay, the mobile phase A was 1.8 M ammonium sulfate, 100 mM potassium phosphate, pH 7.4 and the mobile phase B was 100 mM potassium phosphate, pH 7.4.

HPLC was performed using an Agilent 1200 (Santa Clara, CA) system equipped with a binary pump (G1312B), an autosampler, a heated column compartment, and a diode array detector (G1315C). System management and data acquisition were performed by the ChemStation software (B.04.03 v16). Chromatographic separation was achieved at 30°C on a MAbPac™ HIC-20 HPLC column (250 × 4.6 mm, 5 μm particle size; #088554, ThermoFisher Scientific). Ten µL of analyte were injected at a 0.5 mL/min flow rate and eluted over a 35-min gradient from 20% B to 100% B. Effluent was monitored at λex = 290 nm, λem = 320 nm. Column drift was monitored using early- and late-eluting control mAbs injected approximately every 10 injections and was shown to be negligible (0.03 min retention time standard deviation at pH 6.0 (n = 6 per mAb); 0.2–0.3 min retention time standard deviation at pH 7.4 (n = 6 per mAb)).

Retention time and full-width half-max (FWHM) were determined using the OriginPro 2019 software package (OriginLab). Retention time was recorded by identifying the time at maximum peak height, and FWHM was determined by normalizing all the peaks to have a maximum height of 1 and fitting the normalized peak with a Gaussian function.
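The peak metrics described above can be sketched as follows. This is not the OriginPro routine: as a stand-in for its Gaussian fit, the sketch fits a parabola to the logarithm of the normalized peak (the logarithm of a Gaussian is a parabola), using only points above an assumed fraction of the peak height. The function name `gaussian_fwhm` and the `min_fraction` parameter are hypothetical.

```python
import numpy as np

def gaussian_fwhm(time_min, signal, min_fraction=0.2):
    """Return (retention time at peak apex, Gaussian FWHM) for a single peak."""
    t = np.asarray(time_min, dtype=float)
    y = np.asarray(signal, dtype=float)
    y = y / y.max()                       # normalize to a maximum height of 1
    rt_apex = float(t[np.argmax(y)])      # time at maximum peak height
    mask = y >= min_fraction              # keep the well-defined part of the peak
    x = t[mask] - rt_apex                 # center time for a better-conditioned fit
    # ln(y) = a*x^2 + b*x + c for a Gaussian, with a = -1 / (2 * sigma^2).
    a, _, _ = np.polyfit(x, np.log(y[mask]), 2)
    if a >= 0:
        raise ValueError("signal is not peak-shaped in the fitted window")
    sigma = np.sqrt(-1.0 / (2.0 * a))
    fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma
    return rt_apex, float(fwhm)
```

For a Gaussian peak the FWHM equals 2·sqrt(2·ln 2)·σ, roughly 2.355σ, which is the conversion applied in the last step.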

Heparin chromatography

The interaction of the tested molecules with heparin was characterized by chromatography using a HiTrap Heparin High Performance 1 mL column (Cytiva, catalog# 17040601). Each protein (0.4 mg) was diluted in buffer A (50 mM Tris, 5 mM NaCl, pH 7.6) up to a final volume of 1 mL and then loaded onto a heparin column pre-equilibrated in buffer A. After loading, the column was washed with 5 column volumes of buffer A, and then a linear gradient of 5–400 mM NaCl was applied over 20 column volumes. The conductivity at which the protein eluted (at the center of the elution peak) was used to characterize its heparin binding affinity. Molecules showing no interaction eluted at 100% buffer A conductivity, while molecules with higher affinity eluted at higher conductivity values. At the end of each run the column was washed with 5 column volumes of 50 mM Tris, 2 M NaCl, pH 7.6 and re-equilibrated in buffer A before the next protein was tested.

AC-SINS

Goat anti-human Fc antibody (109-005-098, Jackson ImmunoResearch), the “capture antibody,” was buffer exchanged into 20 mM sodium acetate, pH 4.3 using a Zeba Spin desalting column (89891, ThermoFisher) and diluted to 0.4 mg/mL. One part capture antibody solution was incubated with nine parts citrate-stabilized unconjugated colloidal gold (7 × 10^11 particles/mL, 15701–1, Ted Pella Inc) for 1 h at ambient temperature. Poly(ethylene glycol) methyl ether thiol (2000 Da, Sigma 729140) in 20 mM sodium acetate, pH 4.3 was added to the mixture at a final concentration of 0.1 μM followed by an additional hour of incubation at ambient temperature. The solution was then passed through a 13-mm 0.22 μm PVDF membrane (SLGV013SL; Millex-GV) using a Luer-Lok sterile syringe (BD302995; Becton Dickinson). The gold particles were eluted from the membrane using 1/10th the starting volume of PBS. All test antibodies were formulated in 10 mM sodium acetate, 9% w/v sucrose, pH 5.2 and adjusted to a concentration of 1.0 mg/mL and 0.1 mg/mL. To immobilize the mAbs onto the conjugated nanoparticles, first 10 µL of the 10× concentrated capture antibody-conjugated nanoparticle solution was added to 80 µL of 10 mM sodium acetate, 150 mM sodium chloride, pH 5.2 in a UV transparent 384 well plate. Next, 10 µL of each mAb solution was added and the resulting mixture was incubated for 1 h at ambient temperature.

Absorption spectra were collected from 450 nm to 750 nm at an increment of 2 nm using a SpectraMax M5. Raw data were processed and analyzed by Amgen’s internal software for peak smoothing and fitting to determine the plasmon wavelength’s λmax. Δ λmax was calculated by subtracting the λmax of the non-associative control Ab5 from the λmax of each sample. Dynamic light scattering measurements of the antibody-conjugated gold nanoparticles were made at 25 ± 0.1°C using a DynaPro Plate Reader (Wyatt Technology Corporation) with a 384-well Aurora microplate (1012–00110 or BA2–00110). Triplicate 30 µL wells were measured for each sample using ten acquisitions of 2-s measurements and data analysis was performed using Dynamics 7.6.0.48 software.

Non-specific binding ELISAs

Methods for nonspecific binding ELISAs were modified from those published by Hötzel et al.Citation66 ELISA plates (Costar 3590) were coated with either 1) baculovirus particles (LakePharma 25690) at 0.125 μg/well, 2) CHEM-1 cell membrane preparation (Millipore HTS000MC1) at 0.1 μg/well, 3) poly-D-Lysine (Millipore A-003-E) at 0.1 μg/well, or 4) PEI Max (Polysciences 24765–2) at 0.1 μg/well. Plates were incubated at 4°C overnight. The plates were then blocked with SuperBlock™ T20 (PBS) Blocking Buffer (Thermo Scientific, Cat#37516) for 1 h at room temperature. The plates were washed three times with PBS. Test articles were diluted to 100 nM with SuperBlock™ T20 (PBS) Blocking Buffer. Triplicate aliquots of each test article were added to the plate wells. The plates were incubated for 1 h at room temperature and then washed 6 times with PBS. Goat anti-human IgG conjugated to horseradish peroxidase (Jackson ImmunoResearch, Cat# 109-035-008) was added to each well. The plates were incubated for 30 min at room temperature and then washed 6 times with PBS. TMB (3,3′,5,5′-tetramethylbenzidine) substrate (Thermo Scientific, Cat# 34021) was added to each well. After incubation for 15 min, reactions were stopped by adding 2 M sulfuric acid. Absorbance was read at 450 nm using an AquaMax 4000 Spectrophotometer (Molecular Devices). The assay scores were generated by dividing the OD450 value of each well by the OD450 of a non-coated well.

Thermolysin sensitivity

Each antibody was prepared in incubation buffer (35 mM Tris HCl pH 8.0, 262.5 mM NaCl, 17.5 mM CaCl2) and added to a 96-well flat bottom ultra-low attachment polystyrene plate (Corning 3474). Thermolysin from Geobacillus stearothermophilus (Sigma-Aldrich, T7902) was prepared in H2O and added to the proteins at a 10× molar concentration. For the no thermolysin groups, an equal volume of H2O was added. The samples were incubated at 37°C; at each time point ethylenediaminetetraacetic acid (EDTA) was added and the samples were moved to a new plate and frozen at −20°C. The sample concentrations were measured by immunoassay using biotinylated target protein as the capture reagent and ruthenylated anti-human FcCitation79 as the detection reagent. An in vitro thermolysin ratio was calculated by measuring the AUC with thermolysin present and dividing by the AUC without thermolysin.

FcRn affinity chromatography

FcRn affinity columns were purchased from the manufacturer (Roche Custom Biotech; cat# 08128057 001). Chromatography elution buffers were prepared fresh: 20 mM 2-(N-morpholino)ethanesulfonic acid (MES)/HCl, 140 mM NaCl, pH 5.5 (Buffer A); 20 mM Tris/HCl, 140 mM NaCl, pH 8.8 (Buffer B).

FcRn to huIgG interaction affinity was characterized by comparing the relative elution characteristics between molecules under standard gradient elution assay conditions. Briefly, each panel member was diluted 4-fold into Buffer A to a final mAb concentration of 0.50 mg/mL prior to injection (10 µg) on column. A gradient elution method was applied after equilibration of the column in 20% Buffer B at 0.5 mL/min (pressure ≤10 Bar): 20% Buffer B (0–10 min), 100% Buffer B (80 min), 100% Buffer B (90 min), 20% Buffer B (93 min), 20% Buffer B (103 min). Antibody elution was monitored using both absorbance (280 nm) and intrinsic fluorescence (280 nm excitation, 350 nm emission). Analyte injections were performed only after verifying reproducible peak retention characteristics of calibration standards that included an antibody with standard FcRn affinity (Ab5) and an internal control in the IgG1-Y/T/E isotype with known high affinity for FcRn (huIgG1-Y/T/E). The Ab5 and huIgG1-Y/T/E calibrants were injected between every 12 analyte runs to verify column performance. The 83-member mAb panel was analyzed using a single column, and column performance was graded by assessing retention time drift of the calibration standards (≤0.1 min), as well as maintenance of system pressure (≤20% increase over all injections).
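The gradient program above can be turned into a lookup of Buffer B percentage at any time point, which is the first step toward relating retention time to mobile phase composition. The sketch below assumes linear ramps between the listed time points; the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % Buffer B) breakpoints taken from the gradient program above.
GRADIENT = [(0.0, 20.0), (10.0, 20.0), (80.0, 100.0),
            (90.0, 100.0), (93.0, 20.0), (103.0, 20.0)]

def percent_b(t_min, program=GRADIENT):
    """Percent Buffer B at time t_min, assuming linear ramps between breakpoints."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            # Linear interpolation within the current segment.
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

Under this assumption, a peak eluting at 45 min corresponds to 60% Buffer B.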

The data were exported into .csv format as a text file for manipulation in Origin Pro (OriginLab). First, the data were reduced by a factor of 10 by way of filtering. Next, for each affinity chromatogram a curve fitting routine was applied: asymmetric baseline fitting and subtraction, normalization of the chromatograms based on peak intensity, followed by fitting to a Gaussian peak model. Peak retention time (centroid) and full width at half maximum (FWHM) statistics were tabulated from resultant chromatogram transforms. For conversion of retention time to elution pH, the pH was calculated in RStudioCitation80 from the known buffer gradient composition over time via the methods of Nguyen et al.Citation81 and verified by measuring the pH at select time points throughout a typical run.

AlphaLisa FcRn binding assay method

Human FcRn tagged at the C-terminus with six histidines (FcRn-His) and human IgG1-Biotin (Fc-Biotin) produced internally at Amgen were prepared to a final concentration of 0.6 µg/mL and 0.18 µg/mL, respectively, in AlphaLisa Assay Buffer (25 mM (N-2-hydroxyethylpiperazine-N’-2-ethanesulfonic acid) (HEPES), 0.10 M NaCl, 0.2% bovine serum albumin, pH 6.0). Reference standard (RS) and sample material were serially diluted 3-fold from 2000 µg/mL to 0.914 µg/mL in Assay Buffer. The RS and samples were mixed by pipetting up-and-down ten times. Next, 20 µL each of the prepared FcRn-His and Fc-Biotin was added to each test well (blank wells were filled with Assay Buffer) of an assay plate, followed by 20 µL aliquots of RS and sample dilutions into their respective wells. The plate was sealed, shaken on high speed for 1 min, and then incubated for 1 ± 0.25 hours at room temperature in the dark. Near the end of the incubation, a mixture of streptavidin-coated donor beads and nickel nitrilotriacetic acid (Ni-NTA) chelate-coated acceptor beads was prepared, under dark conditions (beads are light sensitive), by diluting each to 25 µg/mL in Assay Buffer. While maintaining dark conditions, 40 µL of the bead preparation was added to each test well. The plate was sealed and shaken in the dark for 1 min, followed by an incubation for 45 ± 15 min at room temperature in the dark. Post-incubation, the plate seal was removed, and the plate was read on a PerkinElmer Envision 2104 with excitation and emission wavelengths of 680 nm and 570 nm, respectively. Each data point of the dilution curve was run in duplicate per assay plate across three independent assays, with sample results reported as the mean of the (n = 3) determinations. The mean emission values were fitted with a 4-parameter curve fit using SoftMax Pro 5.4.1, and results were reported as percent relative binding, which is calculated as the ratio IC50 reference standard/IC50 sample times 100.
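The dilution scheme and the relative binding calculation reduce to simple arithmetic, sketched below. The eight-point series length is inferred from the stated endpoints (2000 µg/mL divided by 3 seven times gives about 0.914 µg/mL) and is not stated explicitly in the text; the function names are hypothetical, and the 4-parameter fit that produces the IC50 values is not reproduced here.

```python
def serial_dilution(top_conc, fold, n_points):
    """Concentrations of an n-point serial dilution starting at top_conc."""
    return [top_conc / fold ** i for i in range(n_points)]

def percent_relative_binding(ic50_reference, ic50_sample):
    """Percent relative binding = IC50(reference standard) / IC50(sample) x 100."""
    return 100.0 * ic50_reference / ic50_sample


# 3-fold series from 2000 ug/mL; eight points is an inferred series length.
series = serial_dilution(2000.0, 3.0, 8)
```

With this definition, a sample needing twice the reference concentration to reach half-maximal inhibition scores 50% relative binding.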

Mouse in vivo PK study

Animals

All mouse studies were conducted at Amgen Inc. (Thousand Oaks, CA) and were approved by the Institutional Animal Care and Use Committee. Female BALB/c mice were ordered at 6–8 weeks of age from Charles River Laboratories. For each dosing group, the test articles were combined into a single vial to generate the cassettes (a total of 5 mAbs/cassette). The dose level for each mAb per cassette was 2 mg/kg. For cassettes with less than 5 mAbs an equivalent amount of isotype control antibody was included to maintain the same total antibody load for each group. The mice received the test article cassettes through an intravenous bolus dose via the lateral tail vein. Whole blood samples were collected and processed to serum at various time points.

Analysis

Quantitation of mAbs in mouse serum was performed using multi-plex electro-chemiluminescent immunoassays using the Meso Scale Discovery (MSD) U-PLEX system on a Sector 600 instrument. In brief, biotinylated recombinant target proteins for each mAb were used as capture reagents by coupling to U-PLEX Linkers. The U-PLEX Linkers then self-assemble onto unique spots on the U-PLEX plate. After incubation of the samples on the coated plates, a ruthenylated anti-human Fc antibody was used as the detection reagent for all mAbs. In all assays, the analyte serum concentrations were interpolated from standard curves using the corresponding analyte prepared in pooled mouse serum. Noncompartmental analyses were performed using the mean of three animals per time point in non-Good Laboratory Practice (GLP) Watson LIMS v7.5 SP1 (Thermo Fisher Scientific). AUC values were calculated using the linear trapezoidal linear/log interpolation method.
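For orientation, the linear trapezoidal rule that underlies the AUC calculation can be sketched as below. This shows only the basic rule applied to mean concentrations per time point; the linear/log interpolation option and other settings of the Watson LIMS analysis are not reproduced, and the function name is hypothetical.

```python
def auc_linear_trapezoid(times_h, conc_ng_ml):
    """AUC (h x ng/mL) from the first to the last time point by linear trapezoids."""
    if len(times_h) != len(conc_ng_ml) or len(times_h) < 2:
        raise ValueError("need at least two paired (time, concentration) points")
    if any(t1 <= t0 for t0, t1 in zip(times_h, times_h[1:])):
        raise ValueError("time points must be strictly increasing")
    points = list(zip(times_h, conc_ng_ml))
    # Sum the area of each trapezoid between consecutive sampling times.
    return sum(0.5 * (c0 + c1) * (t1 - t0)
               for (t0, c0), (t1, c1) in zip(points, points[1:]))
```

Passing time points that span 0 to 672 h would give the AUC over the interval used for the clearance threshold in this work.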

Chemical liability analysis

Sample preparation

Sodium iodoacetate (I9148), glacial acetic acid (3420099-500 ML), and sodium acetate (32319-500 G-R) were obtained from Sigma Aldrich; 1 M Tris HCl at pH 7.5 (T1075) was obtained from Teknova; EDTA (BP118–500) was obtained from Fisher Scientific; DTT (A39255) and guanidine hydrochloride (GuHCl, 24115) were obtained from Pierce; and L-methionine (2085–05) was obtained from J.T. Baker. Trypsin (03708969001) was obtained from Roche and BioSpin-6 columns (732–6227) were obtained from Bio-Rad.

An aliquot of 100 µg of each sample was diluted to 1 mg/mL in denaturing buffer consisting of 7.5 M guanidine HCl, 250 mM Tris, and 2 mM EDTA for a final volume of 100 µL. DTT (3 µL at 500 mM) was added to this solution followed by incubation at 37°C for 35 min. Samples were allowed to cool to room temperature for 2 min followed by addition of 7 µL of 500 mM sodium iodoacetate and incubation in dark at room temperature for 20 min. Alkylation was quenched with addition of 4 µL of 500 mM DTT and samples were desalted with BioSpin-6 columns using the vendor’s recommended protocol into digest buffer consisting of 50 mM Tris and 20 mM methionine at pH 7.8. Trypsin (10 µL at 1 mg/mL) was added to each sample (1:10 enzyme:substrate ratio) followed by digestion for 35 min at 37°C. Digestion was quenched by diluting the digest mixture 1:1 with 8 M guanidine HCl and 250 mM acetate at pH 4.7.

Reversed phase tandem-MS analysis

For rpLC-MS/MS analysis, mAb digests were analyzed using an Agilent Infinity II UPLC linked to a Thermo Q-Exactive Plus mass spectrometer. The UPLC utilized a Waters ACQUITY BEH C18 column (2.1 mm x 150 mm, 1.7 µm, 130 Å) run at 250 µL/min at a temperature of 50°C. The LC buffers are A: 0.1% formic acid/water and B: 0.1% formic acid/acetonitrile. Peptide digest (6 µg) was injected from an approximate 0.3 µg/µL peptide digest solution and reversed phase separations were performed using the following gradient: 10 min at 1% B, 1 min to 10% B, 67 min to 45% B, 2 min to 90% B, and 5-min isocratic at 90% B, 2 min down to 1% B, a further 2 minutes at 1% B. At this point, a short 90% B wash is implemented. The total UPLC method time is 123 min. The column effluent passed through a UV flow cell (214 nm) prior to introduction to a divert valve in front of the mass spectrometer. The tandem MS method consists of Full MS over m/z [350–2000] at 70K resolution, followed by higher energy collisional dissociation (HCD) fragmentation of the top 10 most abundant precursors at 17.5K resolution. Full MS data is collected in profile mode, with HCD in centroid mode. Full MS uses an automatic gain control (AGC target): 1E6 with a max injection time of 100 msec. HCD fragmentation uses an AGC target: 5E4, max inject time: 200 msec, isolation width: 2.0 m/z, normalized collision energy (NCE): 27, and dynamic exclusion of 10 sec. MS source parameters include Spray voltage: 3.5 kV, Capillary Temp: 253°C, S-lens RF level: 50, Aux gas heater temp: 406°C, Sheath gas flow rate: 46, Aux gas flow rate: 11, and Sweep flow rate: 2. Data from rpLC-MS/MS was processed using both MassAnalyzerCitation42 (available in Biopharma Finder from Thermo Fisher) and Byos® software (Protein Metrics, Inc.)

Data processing and analysis

To develop an attribute-centric data model, we first analyzed the 83 sequences for all possible theoretically modifiable deamidation, isomerization, methionine oxidation and tryptophan oxidation sites. MassAnalyzerCitation42 was used to automatically determine modification levels for all detectable attributes on each of the 83 molecules. The 83 resulting data tables were compiled, cleaned, and filtered for quality, then merged with the table of theoretically modifiable sites using Python (Python Software Foundation). Remaining blanks in the theoretical data table were assessed for peptide mapping coverage by comparing the detected signal from the unmodified peptide that would contain that modification to the maximum signal intensity in the data set for that molecule. Peptides with less than 1% of the max signal were marked as below the limit of quantitation and any potential modifications on these peptides were marked as “No Coverage.” The remaining potential modification sites for each molecule, where no modification was detected by MassAnalyzer but where the MS signal for the unmodified peptide was above the 1% threshold, were deemed to have peptide mapping coverage and assigned 0% modification values to reflect our expectation that, had these been modified, they would have been detected. A subset of the data was manually processed and confirmed in Skyline.Citation43
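The coverage rule for sites without a reported modification level can be expressed as a small decision function. This is a sketch of the logic as described, not the authors' script: the function name, argument names, and the treatment of a signal exactly at the 1% threshold (counted here as covered) are assumptions.

```python
def classify_site(modification_pct, unmodified_signal, max_signal, threshold=0.01):
    """Return the value to record for one theoretically modifiable site.

    modification_pct: detected modification level in percent, or None if blank.
    unmodified_signal: MS signal of the unmodified peptide, or None if not seen.
    max_signal: maximum peptide signal intensity for the same molecule.
    """
    if modification_pct is not None:
        return modification_pct            # a measured value is kept as-is
    if unmodified_signal is None or unmodified_signal < threshold * max_signal:
        return "No Coverage"               # peptide is below the limit of quantitation
    return 0.0                             # peptide is covered, no modification seen
```

Keeping "No Coverage" distinct from 0% avoids treating unobserved peptides as evidence that a site is unmodified.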

Development of viscosity prediction model

Data inclusion criteria

The 83 mAbs in the data set were filtered using the following inclusion criteria: 1) the viscosity measurement was made at a concentration that lies within 150 ± 10 mg/ml; 2) the viscosity measurement was >15 cP at a concentration <140 mg/ml; or 3) the viscosity measurement was <15 cP at a concentration >160 mg/ml. These last two criteria reflect the fact that viscosity increases monotonically with concentration, and so it is reasonable to infer whether viscosity is above or below 15 cP (our cutoff) at 150 ± 10 mg/ml, even if the measurements were made outside that interval. After applying the inclusion criteria, 71 of the 83 mAbs were used in model development. Of these, five were measured at concentrations <140 mg/ml and three were measured at concentrations >160 mg/ml. Using a cutoff of 15 cP to distinguish high from low viscosity, 43 of the 71 mAbs were low viscosity and 28 were high.
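The inclusion logic can be written as a single labeling function, sketched below. It reads criterion 2 as a viscosity above 15 cP measured below 140 mg/ml, which is what the monotonicity argument implies. The function name is hypothetical, and the handling of a value exactly at the cutoff (labeled low here) is an assumption.

```python
def viscosity_label(conc_mg_ml, viscosity_cp, cutoff_cp=15.0):
    """Return 'high', 'low', or None (excluded) for one viscosity measurement."""
    if 140.0 <= conc_mg_ml <= 160.0:
        # Criterion 1: measured within 150 +/- 10 mg/ml, so label directly.
        return "high" if viscosity_cp > cutoff_cp else "low"
    if conc_mg_ml < 140.0 and viscosity_cp > cutoff_cp:
        # Criterion 2: already above the cutoff at a lower concentration.
        return "high"
    if conc_mg_ml > 160.0 and viscosity_cp < cutoff_cp:
        # Criterion 3: still below the cutoff at a higher concentration.
        return "low"
    return None  # class at 150 mg/ml cannot be inferred; exclude
```

Measurements that return `None` correspond to the mAbs dropped from model development.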

Feature engineering

We constructed two categories of features, structure-based and experimental.

Structure-based features: We computed 112 structure-based features using MOECitation35 on homology models of the Fabs assuming a pH of 5.2, based on the formulation pH. The structure-based features reflect biophysical attributes, including those that are global (e.g., pI) and local (e.g., surface patch charges).

Experimental features: For the viscosity model, we limited our analysis to attributes that can be measured using high-throughput, low consumption assays. Specifically, we used measurements from AC-SINS (DiffC, λ-max, Δλ-max, and the risk bin), DLS (Interaction Parameter kD, Zeta Potential), and PEG Precipitation (Conc50). These are referred to as the HT features in the Results section.

Model building and evaluation

We trained binary classifiers using XGBoost and Logistic Regression with an elastic net regularization penalty using the XGBoost (https://xgboost.readthedocs.io/en/stable/) and scikit-learn (https://scikit-learn.org/) libraries, respectively. Zero-variance features were removed, and the remaining features were standardized to have a mean of zero and a standard deviation of one. Hyperparameters were tuned to maximize MCC, a common metric for evaluating models trained on imbalanced data sets, using grid search for Logistic Regression and randomized search for XGBoost, each with 3-fold cross-validation. Hyperparameters for Logistic Regression included the regularization strength and whether to apply class weighting. Hyperparameters for XGBoost included the learning rate, the minimum loss reduction, maximum tree depth, minimum instance weight, subsampling ratio of the training instances, the L1 and L2 regularization weights, and the subsampling ratio of the features. Because the data set is small, models were evaluated on 100 random stratified 70:30 splits to obtain reliable estimates of prediction performance. The median performance over these 100 splits is reported in the Results. The XGBoost classifiers, which learn non-linear decision boundaries, were found to be overfitting, so we excluded them and present only the results of linear classifiers trained via Logistic Regression with an elastic net regularization penalty.
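Two pieces of this evaluation scheme, the MCC metric and the stratified 70:30 split, can be sketched without any library dependencies. The code below is illustrative only: the authors used scikit-learn and XGBoost, the function names here are hypothetical, and model fitting and hyperparameter tuning are not shown.

```python
import math
import random

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def stratified_split(labels, test_fraction=0.3, rng=None):
    """Return (train_idx, test_idx) preserving class proportions."""
    rng = rng or random.Random()
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


# Class balance of the viscosity set: 43 low (0) and 28 high (1) mAbs.
labels = [0] * 43 + [1] * 28
splits = [stratified_split(labels, 0.3, random.Random(seed)) for seed in range(100)]
```

Repeating the split 100 times and taking the median score, as described above, reduces the dependence of the estimate on any single partition of a small data set.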

Development of PK prediction model

Data inclusion criteria

PK clearance data were available for 55 molecules, and these were used to train the model. Using a cutoff of 3.9 × 10^6 h × ng/mL for AUCt results in 50 low-clearance and 5 high-clearance mAbs.

Feature engineering

The structure-based features used to train the PK clearance model were identical to those used to train the viscosity model. The experimental variables used to train the PK clearance model included: AC-SINS Δλmax, Thermolysin AUC ratio, DSF Tagg Onset, 1 mg/mL (°C), FcRn chromatography FWHM (min), Sepax retention time (min), Heparin Chromatography Conductivity at Elution Peak (mS/cm), SE-UHPLC % HMW, t = 0 wk, 1 mg/mL A52Su, Alphascreen FcRn IC50 (μM) pH 6, mAb-Pac HIC-20 pH 7.4 retention time (min), zeta Potential (mV), mAb-Pac HIC-20 pH 6 retention time (min), Membrane Prep Assay Score, % purity (nrMCE), PEG Precipitation Conc50 (%), BVP Assay Score, Heparin Chromatography % Buffer B at Elution Peak, Poly-D-Lysine Assay Score, DSF Tm1, 1 mg/mL (°C), DLS Interaction Parameter kD (mL/g), PEI Assay Score, SE-UHPLC Main Peak Retention Time (min), % purity (SEC), FcRn chromatography retention time (min), Ammonium Sulfate Precipitation Conc50 (M).

Model building and evaluation

The pipeline used to train and evaluate the PK clearance model was identical to that used to train and evaluate the viscosity model. Like the viscosity model, the XGBoost model for PK clearance was deemed to have overfit the data.

Abbreviations

AC-SINS = affinity-capture self-interaction nanoparticle spectroscopy
AGC = automatic gain control
aSEC = analytical size exclusion chromatography
AUC = area under the curve
AUCt = area under the plasma drug concentration-time curve
BVP = baculovirus particle
CD = circular dichroism
CDR-H1 = heavy chain complementarity-determining region 1
CDR-H2 = heavy chain complementarity-determining region 2
CDR-H3 = heavy chain complementarity-determining region 3
CDR-L1 = light chain complementarity-determining region 1
CDR-L2 = light chain complementarity-determining region 2
CDR-L3 = light chain complementarity-determining region 3
CE = capillary electrophoresis
CEX = cation exchange chromatography
CH1 = constant heavy 1
CH2 = constant heavy 2
CHO = Chinese hamster ovary
CM = conditioned media
cP = centipoise
DLS = dynamic light scattering
DSC = differential scanning calorimetry
DSF = differential scanning fluorimetry
DTT = dithiothreitol
EDTA = ethylenediaminetetraacetic acid
ELISA = enzyme-linked immunosorbent assay
ESI = electrospray ionization
Fab = fragment antigen binding
FcRn = neonatal Fc receptor
FT-IR = Fourier transform infrared
FT-Raman = Fourier transform Raman
Fv = fragment variable
GLP = Good Laboratory Practice
GuHCl = guanidine hydrochloride
HCD = higher-energy collisional dissociation
HIC = hydrophobic interaction chromatography
HMW = high molecular weight
HT = high throughput
HT SPE-MS = high throughput solid phase extraction mass spectrometry
IEX = ion exchange chromatography
kD = diffusion interaction parameter
mAb = monoclonal antibody
mAU = milli-absorbance units
MCC = Matthews correlation coefficient
MCE = microcapillary electrophoresis
MES = 2-(N-morpholino)ethanesulfonic acid
MOE = Molecular Operating Environment
mRNA = messenger ribonucleic acid
MS = mass spectrometry
MS/MS = tandem mass spectrometry
MSD = Meso Scale Discovery
MUSCLE = MUltiple Sequence Comparison by Log-Expectation
Mw = molecular weight
nanoDSF = nano differential scanning fluorimetry
NCE = normalized collision energy
Ni-NTA = nickel-nitrilotriacetic acid
NMR = nuclear magnetic resonance
nrMCE = non-reduced microcapillary electrophoresis
Oa-ToF = orthogonal acceleration time of flight
PBS = phosphate-buffered saline
PCA = principal components analysis
PEG = polyethylene glycol
PEI = polyethyleneimine
pI = isoelectric point
PK = pharmacokinetics
ProA = protein A
RMSD = root-mean-square deviation
rpLC-MS = reversed phase liquid chromatography mass spectrometry
rpLC-MS/MS = reversed phase liquid chromatography tandem mass spectrometry
RS = reference standard
SARC = Syngene-Amgen Research and Development Center
SEC = size exclusion chromatography
SEC-MALS = size exclusion chromatography-multi-angle light scattering
SEFL = stable effector functionless
SE-UHPLC = size exclusion ultra-high performance liquid chromatography
T0 = time zero
Tagg = temperature at the onset of aggregation
TCEP = tris(2-carboxyethyl)phosphine
TFA = trifluoroacetic acid
Tm = melting temperature
Tm1 = temperature at the midpoint of the first melting transition
TMB = 3,3′,5,5′-tetramethylbenzidine
TMDD = target-mediated drug disposition
Tonset = temperature at the onset of unfolding
UPGMA = unweighted pair group method with arithmetic mean

Acknowledgments

We would like to acknowledge the Biotechnology Institute of Singapore for production of antibodies, and the Syngene Amgen Research and Development Center (SARC), Benjamin Alba, Fuyi Chen, Michelle Hortter, and Ling Liu for production of Fabs. We thank Dhritiman Jana, Philip An, Evelyn Yang, and Heidi Jones for support registering protein lots, Kevin Kalenian for running the FcRn AlphaScreen assay, and Edward Williams and Adam Birkholz for building our internal DeepAb webserver. The authors thank Professor Patrick Underhill for helpful comments on the manuscript.

Disclosure statement

The authors declare the following competing financial interest(s): M.M., C.J.L., A.S., D.Y., S.C.H., M.A., L.A., V.B., H.C., H-T.C., K.P.C., K.D.C., A.D., S.G-R., K.G., P.G., J.O.H., M.J., S.J., N.K., R.P., E.M.P-O., W.Q., A.J.R., J.S., V.A.T., S.v.D., R.V., V.W., K.W.W., Y.W., M.Y., and I.D.G.C. are full-time employees and shareholders of Amgen Inc. A.W.J., N.A., C-C.C., A.R.C., J.H., V.J., L.J., C.K., L.N., V.R., R.S., C.S., D.W., and Y.S. are shareholders of Amgen Inc.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/19420862.2023.2256745

Additional information

Funding

The author(s) reported that there is no funding associated with the work featured in this article.

References

  • Buntz B. 50 of 2021’s best-selling pharmaceuticals. Drug Discovery & Development. Cleveland, OH: WTWH Media; 2022 Mar 29 [accessed 2022 Jul 1].
  • Hong P, Koza S, Bouvier ES. Size-exclusion chromatography for the analysis of protein biotherapeutics and their aggregates. J Liq Chromatogr Relat Technol. 2012;35(20):2923–21. doi:10.1080/10826076.2012.743724. PMID: 23378719.
  • Haverick M, Mengisen S, Shameem M, Ambrogelly A. Separation of mAbs molecular variants by analytical hydrophobic interaction chromatography HPLC: overview and applications. MAbs. 2014;6(4):852–58. doi:10.4161/mabs.28693. PMID: 24751784.
  • Harris RJ, Kabakoff B, Macchi FD, Shen FJ, Kwong M, Andya JD, Shire SJ, Bjork N, Totpal K, Chen AB. Identification of multiple sources of charge heterogeneity in a recombinant antibody. J Chromatogr B Biomed Sci Appl. 2001;752:233–45. doi:10.1016/s0378-4347(00)00548-x. PMID: 11270864.
  • Suntornsuk L. Recent advances of capillary electrophoresis in pharmaceutical analysis. Anal Bioanal Chem. 2010;398(1):29–52. doi:10.1007/s00216-010-3741-5. PMID: 20437226.
  • Temel DB, Landsman P, Brader ML. Orthogonal methods for characterizing the unfolding of therapeutic monoclonal antibodies: differential scanning calorimetry, isothermal chemical denaturation, and intrinsic fluorescence with concomitant static light scattering. Methods Enzymol. 2016;567:359–89. doi:10.1016/bs.mie.2015.08.029. PMID: 26794361.
  • Garidel P, Hegyi M, Bassarab S, Weichel M. A rapid, sensitive and economical assessment of monoclonal antibody conformational stability by intrinsic tryptophan fluorescence spectroscopy. Biotechnol J. 2008;3:1201–11. doi:10.1002/biot.200800091. PMID: 18702089.
  • Luypaert J, Massart DL, Vander Heyden Y. Near-infrared spectroscopy applications in pharmaceutical analysis. Talanta. 2007;72(3):865–83. doi:10.1016/j.talanta.2006.12.023. PMID: 19071701.
  • Poppe L, Jordan JB, Lawson K, Jerums M, Apostol I, Schnier PD. Profiling formulated monoclonal antibodies by 1H NMR spectroscopy. Anal Chem. 2013;85(20):9623–29. doi:10.1021/ac401867f. PMID: 24006877.
  • Tetin SY, Prendergast FG, Venyaminov SY. Accuracy of protein secondary structure determination from circular dichroism spectra based on immunoglobulin examples. Anal Biochem. 2003;321:183–87. doi:10.1016/s0003-2697(03)00458-5. PMID: 14511682.
  • Minton AP. Recent applications of light scattering measurement in the biological and biopharmaceutical sciences. Anal Biochem. 2016;501:4–22. doi:10.1016/j.ab.2016.02.007. PMID: 26896682.
  • Tomar DS, Kumar S, Singh SK, Goswami S, Li L. Molecular basis of high viscosity in concentrated antibody solutions: strategies for high concentration drug product development. MAbs. 2016;8(2):216–28. doi:10.1080/19420862.2015.1128606. PMID: 26736022.
  • Campuzano IDG, Sandoval W. Denaturing and native mass spectrometric analytics for biotherapeutic drug discovery research: historical, current, and future personal perspectives. J Am Soc Mass Spectrom. 2021;32:1861–85. doi:10.1021/jasms.1c00036. PMID: 33886297.
  • Song YE, Dubois H, Hoffmann M, DE S, Fromentin Y, Wiesner J, Pfenninger A, Clavier S, Pieper A, Duhau L, et al. Automated mass spectrometry multi-attribute method analyses for process development and characterization of mAbs. J Chromatogr B Analyt Technol Biomed Life Sci. 2021;1166:122540. doi:10.1016/j.jchromb.2021.122540. PMID: 33545564.
  • Jain T, Sun T, Durand S, Hall A, Houston NR, Nett JH, Sharkey B, Bobrowicz B, Caffry I, Yu Y, et al. Biophysical properties of the clinical-stage antibody landscape. Proc Natl Acad Sci U S A. 2017;114:944–49. doi:10.1073/pnas.1616408114.
  • Bailly M, Mieczkowski C, Juan V, Metwally E, Tomazela D, Baker J, Uchida M, Kofman E, Raoufi F, Motlagh S, et al. Predicting antibody developability profiles through early stage discovery screening. MAbs. 2020;12(1):1743053. doi:10.1080/19420862.2020.1743053.
  • Kingsbury JS, Saini A, Auclair SM, Fu L, Lantz MM, Halloran KT, Calero-Rubio C, Schwenger W, Airiau CY, Zhang J, et al. A single molecular descriptor to predict solution behavior of therapeutic antibodies. Sci Adv. 2020;6(32):eabb0372. doi:10.1126/sciadv.abb0372. PMID: 32923611.
  • Vatsa S. In silico prediction of post-translational modifications in therapeutic antibodies. MAbs. 2022;14(1):2023938. doi:10.1080/19420862.2021.2023938.
  • Ferreira GM, Calero-Rubio C, Sathish HA, Remmele RL Jr., Roberts CJ. Electrostatically mediated protein-protein interactions for monoclonal antibodies: a combined experimental and coarse-grained molecular modeling approach. J Pharm Sci. 2019;108:120–32. doi:10.1016/j.xphs.2018.11.004. PMID: 30419274.
  • Virk SS, Underhill PT. Application of a simple short-range attraction and long-range repulsion colloidal model toward predicting the viscosity of protein solutions. Mol Pharm. 2022;19(11):4233–40. doi:10.1021/acs.molpharmaceut.2c00582. PMID: 36129361.
  • Sankar K, Krystek SR Jr., Carl SM, Day T, Maier JKX. AggScore: prediction of aggregation-prone regions in proteins based on the distribution of surface patches. Proteins. 2018;86(11):1147–56. doi:10.1002/prot.25594. PMID: 30168197.
  • Liu S, Verma A, Kettenberger H, Richter WF, Shah DK. Effect of variable domain charge on in vitro and in vivo disposition of monoclonal antibodies. MAbs. 2021;13(1):1993769. doi:10.1080/19420862.2021.1993769. PMID: 34711143.
  • Chennamsetty N, Voynov V, Kayser V, Helk B, Trout BL. Prediction of aggregation prone regions of therapeutic proteins. J Phys Chem B. 2010;114(19):6614–24. doi:10.1021/jp911706q. PMID: 20411962.
  • Tomar DS, Li L, Broulidakis MP, Luksha NG, Burns CT, Singh SK, Kumar S. In-silico prediction of concentration-dependent viscosity curves for monoclonal antibody solutions. MAbs. 2017;9(3):476–89. doi:10.1080/19420862.2017.1285479. PMID: 28125318.
  • Grinshpun B, Thorsteinson N, Pereira JN, Rippmann F, Nannemann D, Sood VD, Fomekong Nanfack Y. Identifying biophysical assays and in silico properties that enrich for slow clearance in clinical-stage therapeutic antibodies. MAbs. 2021;13(1):1932230. doi:10.1080/19420862.2021.1932230. PMID: 34116620.
  • Lai PK, Swan JW, Trout BL. Calculation of therapeutic antibody viscosity with coarse-grained models, hydrodynamic calculations and machine learning-based parameters. MAbs. 2021;13(1):1907882. doi:10.1080/19420862.2021.1907882. PMID: 33834944.
  • Akbar R, Bashour H, Rawat P, Robert PA, Smorodina E, Cotet TS, Flem-Karlsen K, Frank R, Mehta BB, Vu MH, et al. Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies. MAbs. 2022;14(1):2008790. doi:10.1080/19420862.2021.2008790. PMID: 35293269.
  • Rai BK, Apgar JR, Bennett EM. Low-data interpretable deep learning prediction of antibody viscosity using a biophysically meaningful representation. Sci Rep. 2023;13(1):2917. doi:10.1038/s41598-023-28841-4. PMID: 36806303.
  • Lai PK, Gallegos A, Mody N, Sathish HA, Trout BL. Machine learning prediction of antibody aggregation and viscosity for high concentration formulation development of protein therapeutics. MAbs. 2022;14(1):2026208. doi:10.1080/19420862.2022.2026208. PMID: 35075980.
  • Lai PK, Fernando A, Cloutier TK, Gokarn Y, Zhang J, Schwenger W, Chari R, Calero-Rubio C, Trout BL. Machine learning applied to determine the molecular descriptors responsible for the viscosity behavior of concentrated therapeutic antibodies. Mol Pharm. 2021;18(3):1167–75. doi:10.1021/acs.molpharmaceut.0c01073. PMID: 33450157.
  • Jacobsen FW, Stevenson R, Li C, Salimi-Moosavi H, Liu L, Wen J, Luo Q, Daris K, Buck L, Miller S, et al. Engineering an IgG scaffold lacking effector function with optimized developability. J Biol Chem. 2017;292:1865–75. doi:10.1074/jbc.M116.748525. PMID: 27994062.
  • Russell W, Burch R. The principles of humane experimental technique. London, UK: Methuen; 1959.
  • Dillon M, Yin Y, Zhou J, McCarty L, Ellerman D, Slaga D, Junttila TT, Han G, Sandoval W, Ovacik MA, et al. Efficient production of bispecific IgG of different isotypes and species of origin in single mammalian cells. MAbs. 2017;9(2):213–30. doi:10.1080/19420862.2016.1267089. PMID: 27929752.
  • Dillon TM, Ricci MS, Vezina C, Flynn GC, Liu YD, Rehder DS, Plant M, Henkle B, Li Y, Deechongkit S, et al. Structural and functional characterization of disulfide isoforms of the human IgG2 subclass. J Biol Chem. 2008;283:16206–15. doi:10.1074/jbc.M709988200. PMID: 18339626.
  • Molecular Operating Environment (MOE), 2020.09. [Software] 1010 Sherbrooke St. West, Suite #910, Montreal, QC, Canada, H3A 2R7: Chemical Computing Group ULC; 2022.
  • BIOVIA. Discovery Studio. [Software] 5005 Wateridge Vista Drive, San Diego, CA 92121: Dassault Systèmes; 2022.
  • Schrödinger Release 2022-2: Maestro. [Software] New York, NY: Schrödinger, LLC; 2021.
  • Ruffolo JA, Sulam J, Gray JJ. Antibody structure prediction using interpretable deep learning. bioRxiv. 2021;3(2):445982. doi:10.1101/2021.05.27.445982.
  • Wen J, Lord H, Knutson N, Wikstrom M. Nano differential scanning fluorimetry for comparability studies of therapeutic proteins. Anal Biochem. 2020;593:113581. doi:10.1016/j.ab.2020.113581. PMID: 31935356.
  • Sharma VK, Patapoff TW, Kabakoff B, Pai S, Hilario E, Zhang B, Li C, Borisov O, Kelley RF, Chorny I, et al. In silico selection of therapeutic antibodies for development: viscosity, clearance, and chemical stability. Proc Natl Acad Sci U S A. 2014;111:18601–06. doi:10.1073/pnas.1421779112. PMID: 25512516.
  • Jacobitz AW, Rodezno W, Agrawal NJ. Utilizing cross-product prior knowledge to rapidly de-risk chemical liabilities in therapeutic antibody candidates. AAPS Open. 2022;8(1):10. doi:10.1186/s41120-022-00057-2.
  • Zhang Z. Large-scale identification and quantification of covalent modifications in therapeutic proteins. Anal Chem. 2009;81:8354–64. doi:10.1021/ac901193n. PMID: 19764700.
  • MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, Kern R, Tabb DL, Liebler DC, MacCoss MJ. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics. 2010;26(7):966–68. doi:10.1093/bioinformatics/btq054.
  • Lu X, Nobrega RP, Lynaugh H, Jain T, Barlow K, Boland T, Sivasubramanian A, Vásquez M, Xu Y. Deamidation and isomerization liability analysis of 131 clinical-stage antibodies. MAbs. 2019;11(1):45–57. doi:10.1080/19420862.2018.1548233. PMID: 30526254.
  • Sydow JF, Lipsmeier F, Larraillet V, Hilger M, Mautz B, Mølhøj M, Kuentzer J, Klostermann S, Schoch J, Voelger HR, et al. Structure-based prediction of asparagine and aspartate degradation sites in antibody variable regions. PloS One. 2014;9(6):e100736. doi:10.1371/journal.pone.0100736. PMID: 24959685.
  • Honegger A, Plückthun A. Yet another numbering scheme for immunoglobulin variable domains: an automatic modeling and analysis tool. J Mol Biol. 2001;309(3):657–70. doi:10.1006/jmbi.2001.4662. PMID: 11397087.
  • Liu Y, Caffry I, Wu J, Geng SB, Jain T, Sun T, Reid F, Cao Y, Estep P, Yu Y, et al. High-throughput screening for developability during early-stage antibody discovery using self-interaction nanoparticle spectroscopy. MAbs. 2014;6(2):483–92. doi:10.4161/mabs.27431.
  • Geoghegan JC, Fleming R, Damschroder M, Bishop SM, Sathish HA, Esfandiary R. Mitigation of reversible self-association and viscosity in a human IgG1 monoclonal antibody by rational, structure-guided Fv engineering. MAbs. 2016;8(5):941–50. doi:10.1080/19420862.2016.1171444.
  • Geng SB, Cheung JK, Narasimhan C, Shameem M, Tessier PM. Improving monoclonal antibody selection and engineering using measurements of colloidal protein interactions. J Pharm Sci. 2014;103(11):3356–63. doi:10.1002/jps.24130.
  • Thiagarajan G, Semple A, James JK, Cheung JK, Shameem M. A comparison of biophysical characterization techniques in predicting monoclonal antibody stability. MAbs. 2016;8(6):1088–97. doi:10.1080/19420862.2016.1189048.
  • Li L, Kumar S, Buck PM, Burns C, Lavoie J, Singh SK, et al. Concentration dependent viscosity of monoclonal antibody solutions: explaining experimental behavior in terms of molecular properties. Pharm Res. 2014;31(11):3161–78. doi:10.1007/s11095-014-1409-0.
  • Garidel P, Blume A, Wagner M. Prediction of colloidal stability of high concentration protein formulations. Pharm Dev Technol. 2015;20(3):367–74. doi:10.3109/10837450.2013.871032.
  • He F, Woods CE, Becker GW, Narhi LO, Razinkov VI. High‐throughput assessment of thermal and colloidal stability parameters for monoclonal antibody formulations. J Pharm Sci. 2011;100(12):5126–41. doi:10.1002/jps.22712.
  • He F, Becker GW, Litowski JR, Narhi LO, Brems DN, Razinkov VI. High-throughput dynamic light scattering method for measuring viscosity of concentrated protein solutions. Anal Biochem. 2010;399(1):141–43. doi:10.1016/j.ab.2009.12.003.
  • Connolly BD, Petry C, Yadav S, Demeule B, Ciaccio N, Moore JMR, Shire SJ, Gokarn YR. Weak interactions govern the viscosity of concentrated antibody solutions: high-throughput analysis using the diffusion interaction parameter. Biophys J. 2012;103(1):69–78. doi:10.1016/j.bpj.2012.04.047.
  • Yadav S, Shire SJ, Kalonia DS. Viscosity behavior of high-concentration monoclonal antibody solutions: correlation with interaction parameter and electroviscous effects. J Pharm Sci. 2012;101(3):998–1011. doi:10.1002/jps.22831.
  • Hu S, Datta-Mannan A, D’Argenio DZ. Physiologically based modeling to predict monoclonal antibody pharmacokinetics in humans from in vitro physiochemical properties. MAbs. 2022;14(1):2056944. doi:10.1080/19420862.2022.2056944.
  • Avery LB, Wade J, Wang M, Tam A, King A, Piche-Nicholas N, Kavosi MS, Penn S, Cirelli D, Kurz JC, et al. Establishing in vitro in vivo correlations to screen monoclonal antibodies for physicochemical properties related to favorable human pharmacokinetics. MAbs. 2018;10(2):244–55. doi:10.1080/19420862.2017.1417718.
  • Grinshpun B, Thorsteinson N, Pereira JN, Rippmann F, Nannemann D, Sood VD, Fomekong Nanfack Y. Identifying biophysical assays and in silico properties that enrich for slow clearance in clinical-stage therapeutic antibodies. MAbs. 2021;13(1):1932230. doi:10.1080/19420862.2021.1932230.
  • Bumbaca Yadav D, Sharma VK, Boswell CA, Hotzel I, Tesar D, Shang Y, Ying Y, Fischer SK, Grogan JL, Chiang EY, et al. Evaluating the use of antibody variable region (Fv) charge as a risk assessment tool for predicting typical cynomolgus monkey pharmacokinetics. J Biol Chem. 2015;290(50):29732–41. doi:10.1074/jbc.M115.692434.
  • Schoch A, Kettenberger H, Mundigl O, Winter G, Engert J, Heinrich J, Emrich T. Charge-mediated influence of the antibody variable domain on FcRn-dependent pharmacokinetics. Proc Natl Acad Sci U S A. 2015;112:5997–6002. doi:10.1073/pnas.1408766112.
  • Datta-Mannan A, Lu J, Witcher DR, Leung D, Tang Y, Wroblewski VJ. The interplay of non-specific binding, target-mediated clearance and FcRn interactions on the pharmacokinetics of humanized antibodies. MAbs. 2015;7(6):1084–93. doi:10.1080/19420862.2015.1075109. PMID: 26337808.
  • Betts A, Keunecke A, van Steeg TJ, van der Graaf PH, Avery LB, Jones H, Berkhout J. Linear pharmacokinetic parameters for monoclonal antibodies are similar within a species and across different pharmacological targets: a comparison between human, cynomolgus monkey and hFcRn Tg32 transgenic mouse using a population-modeling approach. MAbs. 2018;10(5):751–64. doi:10.1080/19420862.2018.1462429.
  • Kelly RL, Sun T, Jain T, Caffry I, Yu Y, Cao Y, Lynaugh H, Brown M, Vásquez M, Wittrup KD, et al. High throughput cross-interaction measures for human IgG1 antibodies correlate with clearance rates in mice. MAbs. 2015;7(4):770–77. doi:10.1080/19420862.2015.1043503. PMID: 26047159.
  • Kraft TE, Richter WF, Emrich T, Knaupp A, Schuster M, Wolfert A, Kettenberger H. Heparin chromatography as an in vitro predictor for antibody clearance rate through pinocytosis. MAbs. 2020;12(1):1683432. doi:10.1080/19420862.2019.1683432.
  • Hötzel I, Theil FP, Bernstein LJ, Prabhu S, Deng R, Quintana L, Lutman J, Sibia R, Chan P, Bumbaca D, et al. A strategy for risk mitigation of antibodies with fast clearance. MAbs. 2012;4(6):753–60. doi:10.4161/mabs.22189. PMID: 23778268.
  • Seo N, Polozova A, Zhang M, Yates Z, Cao S, Li H, Kuhns S, Maher G, McBride HJ, Liu J. Analytical and functional similarity of Amgen biosimilar ABP 215 to bevacizumab. MAbs. 2018;10(4):678–91. doi:10.1080/19420862.2018.1452580. PMID: 29553864.
  • Schlothauer T, Rueger P, Stracke JO, Hertenberger H, Fingas F, Kling L, Emrich T, Drabner G, Seeber S, Auer J, et al. Analytical FcRn affinity chromatography for functional characterization of monoclonal antibodies. MAbs. 2013;5(4):576–86. doi:10.4161/mabs.24981.
  • Igawa T, Tsunoda H, Tachibana T, Maeda A, Mimoto F, Moriyama C, Nanami M, Sekimori Y, Nabuchi Y, Aso Y, et al. Reduced elimination of IgG antibodies by engineering the variable region. Protein Eng Des Sel. 2010;23:385–92. doi:10.1093/protein/gzq009. PMID: 20159773.
  • Bumbaca Yadav D, Sharma VK, Boswell CA, Hotzel I, Tesar D, Shang Y, Ying Y, Fischer SK, Grogan JL, Chiang EY, et al. Evaluating the use of antibody variable region (Fv) charge as a risk assessment tool for predicting typical cynomolgus monkey pharmacokinetics. J Biol Chem. 2015;290(50):29732–41. doi:10.1074/jbc.M115.692434. PMID: 26491012.
  • Datta-Mannan A, Thangaraju A, Leung D, Tang Y, Witcher DR, Lu J, Wroblewski VJ. Balancing charge in the complementarity-determining regions of humanized mAbs without affecting pI reduces non-specific binding and improves the pharmacokinetics. MAbs. 2015;7(3):483–93. doi:10.1080/19420862.2015.1016696. PMID: 25695748.
  • Sun Y, Cai H, Hu Z, Boswell CA, Diao J, Li C, Zhang L, Shen A, Teske CA, Zhang B, et al. Balancing the affinity and pharmacokinetics of antibodies by modulating the size of charge patches on complementarity-determining regions. J Pharm Sci. 2020;109:3690–96. doi:10.1016/j.xphs.2020.09.003. PMID: 32910947.
  • Ollier R, Fuchs A, Gauye F, Piorkowska K, Menant S, Ratnam M, Montanari P, Guilhot F, Phillipe D, Audrain M, et al. Improved antibody pharmacokinetics by disruption of contiguous positive surface potential and charge reduction using alternate human framework. MAbs. 2023;15(1):2232087. doi:10.1080/19420862.2023.2232087. PMID: 37408314.
  • Boswell CA, Tesar DB, Mukhyala K, Theil FP, Fielder PJ, Khawli LA. Effects of charge on antibody tissue distribution and pharmacokinetics. Bioconjug Chem. 2010;21(12):2153–63. doi:10.1021/bc100261d. PMID: 21053952.
  • Campuzano IDG, Robinson JH, Hui JO, Shi SD, Netirojjanakul C, Nshanian M, Egea PF, Lippens JL, Bagal D, Loo JA, et al. Native and denaturing MS protein deconvolution for biopharma: monoclonal antibodies and antibody–drug conjugates to polydisperse membrane proteins and beyond. Anal Chem. 2019;91(15):9472–80. doi:10.1021/acs.analchem.9b00062. PMID: 31194911.
  • Kabsch W. A solution for the best rotation to relate two sets of vectors. Acta Cryst A. 1976;32(5):922–23. doi:10.1107/S0567739476001873.
  • Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinform. 2004;5(1):113. doi:10.1186/1471-2105-5-113. PMID: 15318951.
  • Qi W, Alekseychyk L, Nuanmanee N, Temel DB, Jann V, Treuheit M, Razinkov VP. Resolving liquid-liquid phase separation for a peptide fused monoclonal antibody by formulation optimization. J Pharm Sci. 2021;110(2):738–45. doi:10.1016/j.xphs.2020.09.020.
  • Shih JY, Patel V, Watson A, Hager T, Luan P, Salimi-Moosavi H, Ma M. Implementation of a universal analytical method in early-stage development of human antibody therapeutics: application to pharmacokinetic assessment for candidate selection. Bioanalysis. 2012;4(19):2357–65. doi:10.4155/bio.12.201. PMID: 23088462.
  • RStudio Team. RStudio: integrated development for R. RStudio, PBC; 2020. http://www.rstudio.com/
  • Nguyen MK, Kao L, Kurtz I. Calculation of the equilibrium pH in a multiple-buffered aqueous solution based on partitioning of proton buffering: a new predictive formula. Am J Physiol-Renal. 2009;296(6):F1521–F29. doi:10.1152/ajprenal.90651.2008. PMID: 19339630.