2,680
Views
0
CrossRef citations to date
0
Altmetric
Report

Predicting deamidation and isomerization sites in therapeutic antibodies using structure-based in silico approaches

, , , , & ORCID Icon
Article: 2333436 | Received 08 Sep 2023, Accepted 18 Mar 2024, Published online: 28 Mar 2024

ABSTRACT

Asparagine (Asn) deamidation and aspartic acid (Asp) isomerization are common degradation pathways that affect the stability of therapeutic antibodies. These modifications can pose a significant challenge in the development of biopharmaceuticals. As such, the early engineering and selection of chemically stable monoclonal antibodies (mAbs) can substantially mitigate the risk of subsequent failure. In this study, we introduce a novel in silico approach for predicting deamidation and isomerization sites in therapeutic antibodies by analyzing the structural environment surrounding asparagine and aspartate residues. The resulting quantitative structure-activity relationship (QSAR) model was trained using previously published forced degradation data from 57 clinical-stage mAbs. The predictive accuracy of the model was evaluated for four different states of the protein structure: (1) static homology models, (2) enhancing low-frequency vibrational modes during short molecular dynamics (MD) runs, (3) a combination of (2) with a protonation state reassignment, and (4) conventional full-atomistic MD simulations. The most effective QSAR model considered the accessible surface area (ASA) of the residue, the pKa value of the backbone amide, and the root mean square deviations of both the alpha carbon and the side chain. The accuracy was further enhanced by incorporating the QSAR model into a decision tree, which also includes empirical information about the sequential successor and the position in the protein. The resulting model has been implemented as a plugin named “Forecasting Reactivity of Isomerization and Deamidation in Antibodies” in MOE software, completed with a user-friendly graphical interface to facilitate its use.

Introduction

Monoclonal antibodies (mAbs) and modern antibody-related products are well-established therapeutics that greatly contribute to the economic success of the biopharmaceutical industry.Citation1 Blockbuster drugs as Humira, Keytruda, and Stelara exemplify the remarkable therapeutic and economic value of these mAbs. However, before obtaining market authorization, new drug candidates face high attrition rates in research and development (R&D).Citation2,Citation3 To circumvent costly late-stage failures, new drug candidates are characterized and optimized using a set of predictive experimental and computational techniques, following the quality-by-design approach.Citation4,Citation5 Critical quality attributes (CQAs), including the chemical stability of the complementarity-determining regions (CDRs), are crucial because non-enzymatic modifications can affect the mAb structure and function, comprising drug safety and efficacy.Citation6–9

Asparagine (Asn) deamidation and aspartic acid (Asp) isomerization are among the most prevalent chemical modifications of mAbs.Citation10 The deamidation reaction occurs under physiological conditions but is significantly accelerated in an alkaline environment,Citation9,Citation11,Citation12 whereas the isomerization reaction is accelerated under neutral to acidic conditions.Citation13,Citation14 Both reactions follow an interrelated pathway (), beginning with a nucleophilic attack by the peptide bond nitrogen (NHn + 1) of the carboxyl-flanking residue on the γ-carbonyl (Cγ) of the Asx (Asn/Asp) side chain to form an aspartyl succinimide intermediate.Citation15 This intermediate then transforms into Asp or iso-aspartic acid (Iso-Asp) through hydrolysis, with the rates influenced by the steric and electrostatic environment within the protein structure.Citation10,Citation16–24

Figure 1. Mechanism of asparagine deamidation and aspartate isomerization through an aspartyl-succinimide intermediate. The R group represents the side chain of the subsequent amino acid relative to the reference asparagine/aspartic acid.

Figure 1. Mechanism of asparagine deamidation and aspartate isomerization through an aspartyl-succinimide intermediate. The R group represents the side chain of the subsequent amino acid relative to the reference asparagine/aspartic acid.

The first computational approaches to predict the degradation of Asx sites were based on pairwise sequence motifs. These approaches indirectly considered structural parameters such as conformational flexibility, the size of the C-terminally flanking residue, and secondary structure.Citation13,Citation15,Citation20,Citation25–28 Later, the consideration of the CDR identityCitation10 and secondary structure propensityCitation29 further increased the predictive power. Other computational studies improved the predictive models even more by analyzing the complex conformational environment of the reactive groups within the mAb structure.Citation19,Citation30–32 The most important molecular descriptors are generally related to local structural elements, structure flexibility, solvent accessibility, and relative geometrical arrangements of the amino acids such as torsion angles or the distance between the reacting carbon and nitrogen.

In this study, we computationally modeled and analyzed the structures of 57 clinical-stage antibodies using conformational ensembles across a wide pH range. Several molecular descriptors have been identified to be predictive of Asx degradation by correlation with the experimental data from literature.Citation10 The results were evaluated to generate a quantitative structure-activity relationship (QSAR) model that predicts the degradation risk of Asn deamidation and Asp isomerization sites in mAb CDRs with a balanced accuracy of 79% and 77%, respectively.

Results

Selection of molecular properties related to chemical degradation

Given the degradation pathways of deamidation and isomerization in proteins (), we selected eight molecular descriptors to evaluate their capability to predict liable Asx residues (). To account for the steric hindrance in the degradation pathway, we considered the flexibility of residues by calculating the root mean squared deviation (RMSD) of the alpha carbon (Cα), as well as the RMSD of all atoms in the side chain (CA_rmsd and SC_rmsd). Furthermore, the acidity of NHn + 1 in the residue following Asx on the C-terminal side of the protein sequence plays a crucial role in the nucleophilic attack of Cγ, as the deprotonation of this group is necessary for the formation of the succinimide intermediate,Citation17 ,Citation33 ,Citation34 Therefore, we evaluated the pKa value of NHn + 1 as an additional residue descriptor (Amide_pKa). For isomerization, the pKa value of the Asp side chain was also considered (Asp_pKa) because the carboxyl group needs to be in the protonated form (C(=O)OH) to allow the dehydration reaction.Citation16,Citation27 Since water molecules mediate the nucleophilic attack on the carbonyl group in the succinimide intermediate, we calculated the solvent-accessible surface area (ASA) of the side chain of Asx residues (ASA), as well as the ASA of the adjacent residue (ASA(n + 1)). To specifically consider the solvent exposure of the backbone, we also calculated the ASA of NHn + 1 (Area_amide). In addition, for all conformational representations used herein, the number of sampled conformations with an Asx motif having Area_amide above zero was divided by the total amount of sampled conformations to obtain the fraction of conformations with water-exposed NHn + 1 moieties (Frac_amide).

Table 1. Selection of eight molecular descriptors used for the prediction of deamidation and isomerization in the CDRs of antibodies.

We conducted a preliminary screening by sampling 50 conformations using Low-Mode molecular dynamics (MD). This method generates diverse structures in a short time by using low-frequency vibrational modes during brief MD runs.Citation35 The protonation state of the proteins was adjusted to pH 6, the most common formulation condition used in therapeutic antibodies.Citation36 The aim of this preliminary screening was to detect correlations with the experimental degradation rates and establish possible descriptor associations for testing different conformational sampling techniques. The collected molecular descriptors from all 452 Asx motifs were correlated with the corresponding chemical degradation rate, expressed as a binary variable (degraded or not). The correlation matrix, shown in , highlights significant correlations with several molecular descriptors.

Figure 2. Correlation matrices of the degradation rate and the descriptors obtained in a preliminary screening. The screening involved sampling 50 conformations using low-mode MD at pH 6. The Pearson’s correlation coefficient multiplied by 100 is shown. The matrices show the correlation of the binary degradation rate (binary deg.rate) and the descriptors from (a) 260 Asn motifs and (b) 192 Asp motifs.

Figure 2. Correlation matrices of the degradation rate and the descriptors obtained in a preliminary screening. The screening involved sampling 50 conformations using low-mode MD at pH 6. The Pearson’s correlation coefficient multiplied by 100 is shown. The matrices show the correlation of the binary degradation rate (binary deg.rate) and the descriptors from (a) 260 Asn motifs and (b) 192 Asp motifs.

However, since no single strong correlation was observed with a specific descriptor and to avoid overfitting by using all of them together, three separate groups were used, each containing a subset of molecular descriptors (full list in ). In the preliminary screening, utilizing only ASA and ASA(n + 1) (SASA-group) resulted in balanced accuracies of around 60%. By adding three more descriptors (Area_amide, Amide_pKa and Asp_pKa) to form the Static-group, accuracy generally increased by about 5%. These descriptors can be obtained from static structures without the need for protein sampling, making them more computationally efficient. On the other hand, Frac_amide, CA_rmsd and SC_rmsd require more than one conformation.

Table 2. Overview of molecular descriptor combinations for predicting chemical degradation.

A small subset of descriptors, referred to as the Dynamic-group, was selected based on their importance in the final binary model. The Binary QSAR technique implemented in Molecular Operating Environment (MOE) assigns weights to these descriptors to distinguish between active and inactive molecules.Citation37,Citation38 Two descriptors, Area_amide and Frac_amide, despite exhibiting a good correlation to the degradation rate (), were excluded from the Dynamic-group due to their low weight contribution to the final model. Asp_pKa was removed for the same reason. The combination of descriptors in the Dynamic-group demonstrated superior predictive power compared to all other preliminary models.

The use of static homology models for predicting chemical changes

The first step in building predictive Asx degradation models involved using homology models of variable regions (Fv) that were built from available sequences and did not account for protein conformational variability. The SASA-group, which requires minimal computational effort, was used as a reference predictor set. Combined with homology models, degradation was predicted with a balanced accuracy of 62% for Asn (with an area under the curve, or AUC, of 0.72, ) and 55% for Asp (AUC of 0.56, ).

Table 3. Metrics for models predicting Asn degradation in validation set. The letters A, B and C refer to the different sets of predictors. ‘A’ represents the SASA-group (including ASA and ASA(n + 1)), ‘B’ denotes the Static-group (including ASA, ASA(n + 1), Amide_pka, and Area_amide), and ‘C’ refers to the Dynamic-group (including ASA, SC_rmsd, Ca_rmsd, and Amide_pKa). The sequence-based approach classifies NG and NS as the only two motifs susceptible to degradation.10

Table 4. Metrics for models predicting Asp degradation in validation set. The letters A, B and C correspond to different sets of predictors. ‘A’ represents the SASA-group (including ASA and ASA(n + 1)), ‘B’ denotes the Static-group (including ASA, ASA(n + 1), Amide_pka, Asp_pka and Area_amide), and ‘C’ refers to the Dynamic-group (including ASA, SC_rmsd, Ca_rmsd, and Amide_pKa). The sequence-based approach classifies DG and DS as the only two motifs susceptible to degradation.Citation10.

The second predictor set, denoted as the Static-group, provides additional molecular information about the backbone amide NHn +1. For Asp, it also includes a computational evaluation of the residue’s side chain acidity. This set slightly improved the prediction of Asp degradation, increasing balanced accuracy from 55% to 59%. However, it slightly reduced the accuracy for Asn degradation prediction, with balanced accuracy decreasing from 62% to 60%.

Unlike the ASA value of Asx, which reflects whether the residue is exposed or buried inside the protein, the descriptors Amide_pKa and Asp_pKa depend on local structural properties, such as hydrogen bonds between backbone amide and backbone carbonyl groups. Single homology models do not adequately describe these local structures, which fluctuate over time. Besides balanced accuracy and the AUC, the True Positive Rate (TPR, the probability that an actual liable motif will test liable in the model) was considered to evaluate the model’s ability to accurately identify degraded motifs. For Asn motifs, TPR from SASA-group and Static-group was 41% and 47%, respectively. For Asp motifs, the TPR was 33% for both predictor sets. Despite TPR being used to evaluate the performance of the models in predicting liable motifs, balanced accuracy and AUC were mainly considered to select the best set of descriptors among all generated models discussed below.

Degradation predictions via conformational sampling generally outperforms the use of only homology models

We applied three different sampling methods to explore antibody conformations, always starting from the homology models. This approach should yield higher accuracies, as all chemically degraded motifs of interest are located in the highly flexible CDRs. Conformational sampling provides a more realistic representation about the Asx residues’ reactive conformations.

With kinetic energy mainly distributed on the low-frequency vibrational modes, Low-Mode MDCitation35 is able to generate many different conformations in a short period of computational time (see Methods section for details). In contrast, atomistic MD simulations require increased computational time to create trajectories that are long enough for significant conformational rearrangements to happen. The Stochastic Titration (ST) protocolCitation39 balances these sampling techniques. It applies Low-Mode MD and proton affinity calculationsCitation40 in an iterative way to sample conformations at a desired pH value. As the chemical degradation of Asx residues is highly pH-dependent, better predictions are expected using the ST technique. In the following sections, we evaluate these three sampling methods for modeling Asx degradation.

Low-mode MD

Using Low-Mode MD, we initially sampled 50 conformations for each homology model. The results showed that the balanced accuracy did not improve significantly for the SASA-group, but it did increase for the Static-group from 59% to 71% with an AUC from 0.63 to 0.75 for Asn degradation. However, there was no increase for Asp degradation (balanced accuracy remained at 59%), although the AUC increased from 0.59 to 0.65 ().

Figure 3. Comparison of QSAR models predicting Asn and Asp degradation in validation set. The models were generated using different sampling techniques and predictor sets: energy-minimized homology models (hom), conformational ensembles obtained by Low-ModeMD at pH 6 with 50 (L50) or 400 (L400) structures, conformational ensemble obtained by the stochastic titration protocol (ST-pH6),Citation39 and sampling within 200 ns of classical molecular dynamics simulation. In addition to pH 6.0, the ST protocol was also applied at pH 5.5 and pH 8.5 (ST-stress). The classification threshold of the best ST-model was further reduced to 0.3 (ST-thres03). The predictor sets are categorized into three groups. Suffix A: SASA-group, suffix B: static-group, suffix C: dynamic-group. The sequence-based approach classifies all Asx residues with a subsequent glycine or serine as liable.Citation10 The error bar shows the standard error of the metrics from 15-fold cross validation.

Figure 3. Comparison of QSAR models predicting Asn and Asp degradation in validation set. The models were generated using different sampling techniques and predictor sets: energy-minimized homology models (hom), conformational ensembles obtained by Low-ModeMD at pH 6 with 50 (L50) or 400 (L400) structures, conformational ensemble obtained by the stochastic titration protocol (ST-pH6),Citation39 and sampling within 200 ns of classical molecular dynamics simulation. In addition to pH 6.0, the ST protocol was also applied at pH 5.5 and pH 8.5 (ST-stress). The classification threshold of the best ST-model was further reduced to 0.3 (ST-thres03). The predictor sets are categorized into three groups. Suffix A: SASA-group, suffix B: static-group, suffix C: dynamic-group. The sequence-based approach classifies all Asx residues with a subsequent glycine or serine as liable.Citation10 The error bar shows the standard error of the metrics from 15-fold cross validation.

Next, we increased the sampling from 50 to 400 conformations. This number of conformations was chosen as a compromise between having a sufficient conformational distribution and a reasonable computational effort. In fact, obtaining more conformations might not be compatible with timelines expected during R&D phases. The increase in conformations slightly improved the accuracy for some descriptor combinations for Asn degradation. We then evaluated the performance of the most predictive descriptors for the generated QSAR models. The Dynamic-group, which includes descriptors such as ASA, Amide_pKa, CA_rmsd, and SC_rmsd, showed high performance metrics in the validation set without overfitting. Increasing the conformational sampling from 50 to 400 conformations improved the balanced accuracy for Asn motifs from 62% to 71% (AUC from 0.70 to 0.80). However, the balanced accuracy for the Asp model decreased from 62% to 59% (AUC from 0.73 to 0.67, ).

In summary, for Asp degradation models, the Dynamic-group was more predictive with the smaller 50-conformation dataset and did not improve with the 400-conformation dataset. The Static-group yielded similar results for both 50-conformation (59% balanced accuracy and 0.65 AUC) and 400-conformation (60% balanced accuracy and 0.63 AUC) datasets. For the Asn models, the ASA-/Dynamic-groups showed better performance with the larger conformation ensemble. Additionally, all three predictor groups for both Asn and Asp models had higher accuracies than any QSAR model evaluated on the homology models, indicating that sampling generally improves the predictions.

Stochastic titration

Besides sampling antibody conformations at a fixed pH value, we also investigated how ST influences the conformation, and consequently, the molecular descriptors of interest by altering the protonation states of residues. The ST protocol allows simulation of the stress conditions of forced degradation experiments, as the protonation state is recalculated for each protein conformation within the generated ensemble.Citation39 We sampled 1000 conformations, evenly distributed between pH 4 and 10. After calculating the molecular properties for all conformations, an ensemble average is built, considering the desired pH value and the energy associated with the given conformation. For both Asn and Asp motifs, the Dynamic-group was found to be the most predictive group. Among all three chosen reference pH values (5.5, 6.0, and 8.5), the Asn prediction improved by approximately 4% in balanced accuracy and by 0.09 in AUC at pH 8.5, which is the pH where deamidation was triggered in the experiments. While the AUC for Asn models increased from 0.80 to 0.84 when transitioning from Low-Mode MD sampling to ST, Asp degradation models did not improve. The predictive model for Asp also displayed a difference between the two target values of pH 5.5 (54% balanced accuracy, 0.60 AUC) and pH 6.0 (59% balanced accuracy and 0.65 AUC, ).

In summary, models for Asn deamidation performed better with an increased number of sampled conformations and at the basic pH value of 8.5. Notably, the dataset for Asn was experimentally evaluated at the same pH value, suggesting that structural parameters responsible for deamidation are more accurately estimated under basic conditions. In contrast, models for Asp isomerization showed opposite trends with Low-Mode MD and ST. Low-Mode MD led to the best results using a smaller conformation dataset, while ST using pH 5.5, which is the condition of the experimental forced degradation study, resulted in approximately 5% lower balanced accuracy.

The structure-based model was assessed on frequently occurring motifs within CDR regions. After aligning 57 mAb sequences, four positions showed an Asx residue with at least 30% occurrence. Excluding motifs with a consecutive proline, the model correctly predicted 87 of 90 stable residues, but failed to detect any of the 5 degraded motifs. The model’s limitations in predicting Asx degradation for residues at conserved CDR sequence positions could be attributed to several factors. Firstly, the model may be constrained by the limited data on rare germline residues that degrade. Secondly, the model may be biased toward common chemically stable germline residues due to their higher frequency in the training data, leading to a lower predictive power for rare residues. Lastly, the features used to train the model may not adequately capture the characteristics of rare germline residues that degrade. These factors, coupled with the use of a single canonical conformation in the initial homology model, might interfere with accurately evaluating molecular descriptors for highly conserved residues. Although the data is limited, these findings underscore the need for further investigation and refinement of the model, particularly in relation to degrading residues at germline positions.

Molecular dynamics simulations

After simulating the Fv of the antibodies for 200 ns, we extracted 300 conformations and averaged the eight key descriptors along the trajectory. The prediction models based on these molecular features did not significantly improve compared to the previously mentioned models. For predicting Asp isomerization, the model from MD simulations using predictors from the Dynamic-group showed 62% balanced accuracy with an AUC of 0.67, which is only a marginal improvement to the model obtained from Low-Mode MD (59% balanced accuracy and 0.67 AUC). For predicting Asn deamidation, the data from MD simulations generated a model with a balanced accuracy of 68% and AUC of 0.77 (). This model was less predictive compared to the model from stochastic titration (71% balanced accuracy and 0.84 AUC). In conclusion, since the isomerization prediction improved only slightly and deamidation prediction worsened compared to other sampling methods, the additional computational time required for running the MD simulations is not justified by the obtained accuracy.

Combination of sequence- and structure-based computed features enhances degradation model accuracy

Of all sampling techniques and predictor groups we applied, the Dynamic-group showed good performance in predicting Asn and Asp degradation at pH 8.5 and 6.0, respectively (see ). This set of predictors was chosen to test the QSAR model on an independent dataset composed of 11 in-house mAbs (defined as the test set), which have a total of 19 Asn and 38 Asp motifs within the Fv regions. Among these, 2 Asn and 10 Asp motifs were degraded. The QSAR model achieved a balanced accuracy of 68% for Asp with an AUC of 0.68. For Asn, both degraded motifs were correctly identified, with only one false positive among the 17 undegraded motifs (balanced accuracy over 90% due to fewer data points for Asn).

Although the in-house mAbs test set yielded acceptable accuracies, demonstrating the capability of the QSAR model to generalize to unseen data, we found that combining the QSAR model with a sequence-based classification from experiments could enhance the predictive models. We used the sequence-based classification as a benchmark to assess whether the QSAR model enhances the final combination model. Previous studies have shown that the residue following Asn and Asp strongly influences the degradation rate.Citation26,Citation28 Glycine and serine are the most critical residues for triggering degradation due to steric and catalytic effects.Citation10,Citation17,Citation20

For Asn, a sequence-based classification, which assumes only NG and NS motifs as liable, showed a TPR of 52% and a balanced accuracy of 71%. In comparison, the QSAR model had a TPR of 58% and the same balanced accuracy of 71% (). The performances are thus comparable but based on different knowledge: the QSAR model uses the molecular descriptors (ASA, Amide_pKa, CA_rmsd and SC_rmsd) focused on characterizing the molecular environment around the Asx residue, while the sequence-based classification only uses the identity of the C-terminal flanking residue. We aimed to combine both classifications to improve accuracy.

Initial attempts using decision trees with only two branches and the original QSAR model were unsuccessful (data not shown), as the combination model led to similar evaluation metrics as the sequence-based model. To obtain better balanced accuracy values, we optimized the threshold that converts the QSAR model probability into the predicted class. The trade-off between TPR and the True Negative Rate (TNR, the probability that an actual inactive motif will test negative in the model) becomes apparent as the threshold changes. When the threshold is set to 1, all data points are classified as negative, which results in a TNR of 100%, a TPR of 0%, and a balanced accuracy of 50%.

Reducing the threshold to 0.3 was found to enhance balanced accuracy (). For this reason, the QSAR model with this threshold was chosen to be incorporated into the decision tree. For Asn, while sequence-based prediction yielded 71% balanced accuracy, the combined model achieved instead a balanced accuracy of 73% and decreased false positives from 10% to 6%. To further improve the accuracy of the combination model, we took into account the findings from Lu et al.,Citation10 where the authors identified a strong correlation between frequently or rarely modified sequence positions. They classified frequently degraded motifs in the heavy chain at positions 54 and 98, and in the light chain at positions 30 to 30F (Kabat-Chothia annotation) as hotspots that typically degrade (accounting for 36 of the 73 degraded motifs in the dataset). In contrast, DS or DD motifs at position 61 in CDR heavy chain H2 were defined as coldspots due to their low tendency for chemical degradation (only 1 of the 42 coldspots was degraded). shows correlations between all molecular descriptors, classification models (QSAR, sequence-based, hot-/coldspots) and binary degradation rates. There is a significant pairwise correlation between the binary degradation rate and three classification methods: QSAR model, sequence-based prediction, and hotspots. Coldspots are poorly correlated with the binary degradation rate, mainly due to their low occurrence in the dataset. Moreover, there is no significant correlation between the descriptor ASA(n + 1) and degradation, suggesting that the importance of the successor amino acid in the degradation process is not prevalently related to its exposure to solvent. In this case, using the sequence-based knowledge of the XG/XS motif presence would probably contribute more to increasing the degradation prediction. The Amide_pKa descriptor is more correlated to the degradation rate for Asn compared to Asp, potentially accounting for the enhanced strength of the Asn QSAR model.

Figure 4. Accuracy metric of the model built from the Dynamic-group and sampling with ST-protocol at (a) pH 8.5 for Asn and (b) pH 6 for Asp in validation set (TNR: true negative rate, probability that an actual stable motif will test negative in the model; TPR: true positive rate, probability that an actual degraded motif will test positive in the model). The prediction probability from the QSAR model was converted to a binary classification using five different thresholds. The error bar shows the standard error of the metrics from 15-fold cross validation.

Figure 4. Accuracy metric of the model built from the Dynamic-group and sampling with ST-protocol at (a) pH 8.5 for Asn and (b) pH 6 for Asp in validation set (TNR: true negative rate, probability that an actual stable motif will test negative in the model; TPR: true positive rate, probability that an actual degraded motif will test positive in the model). The prediction probability from the QSAR model was converted to a binary classification using five different thresholds. The error bar shows the standard error of the metrics from 15-fold cross validation.

Figure 5. Correlation matrices of the descriptors obtained by applying stochastic titration at pH 6 and 8.5 for Asn and Asp, respectively. The Pearson’s correlation coefficient multiplied by 100 is shown. (a) Correlation between descriptors from 260 Asn motifs. (b) Correlation between descriptors from 192 Asp motifs. The QSAR model was built with the Dynamic-group.

Figure 5. Correlation matrices of the descriptors obtained by applying stochastic titration at pH 6 and 8.5 for Asn and Asp, respectively. The Pearson’s correlation coefficient multiplied by 100 is shown. (a) Correlation between descriptors from 260 Asn motifs. (b) Correlation between descriptors from 192 Asp motifs. The QSAR model was built with the Dynamic-group.

Considering these correlations, we incorporated the hot-/coldspot knowledge-based information as the first branch of the decision tree in the combination model. By integrating these classification rules, we developed a decision tree called FRIDA (“Forecasting Reactivity of Isomerization and Deamidation in Antibodies”) that enhances prediction accuracy by combining structure-based and sequence-based information ().

Figure 6. Decision tree for evaluating reactivity of Asn/Asp motifs. The decision tree uses three branches to assess the reactivity. First, the hot-/coldspot definitions from Lu et al. Citation10 are applied to classify Asx at positions H54, H98 and L30-L30F, as well as DS and DD motifs at position H61. Second, the structure-based QSAR model evaluates whether the motif has a high probability of being stable. If the outcome of the QSAR model exceeds the probability threshold of 0.3, the sequence is then screened for liable motifs (XG and XS).

Figure 6. Decision tree for evaluating reactivity of Asn/Asp motifs. The decision tree uses three branches to assess the reactivity. First, the hot-/coldspot definitions from Lu et al. Citation10 are applied to classify Asx at positions H54, H98 and L30-L30F, as well as DS and DD motifs at position H61. Second, the structure-based QSAR model evaluates whether the motif has a high probability of being stable. If the outcome of the QSAR model exceeds the probability threshold of 0.3, the sequence is then screened for liable motifs (XG and XS).

When we applied FRIDA to the in-house test set of 11 mAbs of Asp motifs, the balanced accuracy increased from 68% to 79% (AUC increased from 0.68 to 0.75) compared to only using the QSAR model. For Asn motifs, FRIDA performed at the same level as the QSAR model alone, with 18 of 19 motifs classified correctly. When examining the dataset from Lu et al.,Citation10 the balanced accuracy of the QSAR model alone increased from 71% to 79% for Asn motifs and from 59% to 77% for Asp motifs by using the decision tree. This combined approach effectively addresses the uncertainties of the QSAR model for Asp in detecting reactive motifs from the dataset while maintaining its strength in filtering out inactive motifs. As the tree is independent of the type of structure-based model, it can also serve as a potential approach for other structure-based prediction models.

Discussion

Our study introduces a novel in silico approach for predicting deamidation and isomerization sites in CDRs of therapeutic antibodies, which has important implications for the development of more stable and effective drugs. Previous sequence-based methods often overpredicted,Citation10,Citation26,Citation29 while the consideration of a structure-based description improved predictive response.Citation30 We suggest further enhancement with conformational sampling using Low-Mode MDCitation35 or STCitation39 within the MOE suite,Citation37 balancing conformational sampling with computational cost and time. Our workflow, available on the CCG SVL Exchange website (http://svl.chemcomp.com), is user-friendly and accessible to the scientific community. We introduce a decision tree, FRIDA, integrating the structure-based QSAR model with two additional knowledge-based classification rules. This model combines the strengths of structure-based and sequence-based approaches, with the QSAR model reducing the high false positive rate of the sequence-based approach. This new combined approach allows for modular improvements and inclusion of additional motifs or updated QSAR models. In our QSAR model, we identified three molecular feature classes as being the most predictive: 1) the solvent exposure of Asx, 2) the acidity of the flanking backbone nitrogen, and 3) the flexibility of both the entire residue side chain and the alpha carbon only. The importance of solvent accessible area and flexibility as descriptors for predicting chemical degradation has been previously reported by numerous research groups.Citation20,Citation26,Citation30,Citation32 The propensity of the flanking amide group to release a proton has been indirectly accounted for in previous studies, typically through calculations such as the number of hydrogen bonds or the distance to the side chain oxygen atom in Asx.Citation20,Citation26,Citation30 In our study, we directly estimated the pKa of the amide group using ensemble-based estimates, which inherently consider pH and chemical environment.Citation41 This descriptor might provide a more accurate evaluation of the protonation state of amide groups with respect to descriptors used in previous works. Other molecular features used in predictive models in existing literature are indirectly included in our set of descriptors. For instance, the secondary structure of Asx or of its flanking residue has been widely used as a categorical variable for prediction of Asx reactivity.Citation20,Citation26,Citation32 It is well-known that deamidation is less likely to occur efficiently if the involved residue is part of an alpha-helix.Citation42 In our approach, the effects of a more constrained environment are factored in by considering the flexibility of the residues (RMSD, either of alpha carbon or side chain) and their exposure to solvent (ASA descriptors).

We intentionally excluded descriptors specifically designed to characterize CDRs, such as CDR length or type,Citation20 to create a model that could be generalized to other proteins. As a result, the predictive model we developed can be used to evaluate degradation likelihood not only in regions outside of the CDRs, but also in non-mAb molecules.Citation26,Citation43 Our QSAR model is based on 3D molecular descriptors, which are generally applicable to any protein. However, this may not be the case for the combination model, where the hotspots and coldspots are closely tied to the alignment of the antibody variable region and the specific position of the residue within the protein folding.

Contrary to a previous study,Citation20 we did not filter motifs with degradation propensities near the binary classification cutoff during data selection. This intentional decision increased prediction error, but enhanced our model’s generalization and reduced bias. To further mitigate overprediction of stable motifs, we balanced the dataset before QSAR model construction. Our method could still be improved by incorporating ensemble-based molecular descriptors specifically addressing Asp isomerization, which proved more challenging to predict than Asn deamidation. Cost-effective sampling methods like Low-Mode MD or ST could facilitate the exploration of such novel descriptors. Moreover, the application of the structure-based approach on different initial homology models displaying different CDR canonical conformations could improve the lower accuracy prediction for unstable motifs observed on Asx located at conserved sequence positions.

The QSAR model can be successfully applied with a feedback loop from experiments, which will serve to validate and fine-tune the predictions. This complementary approach ensures a more reliable and precise output, thereby increasing the overall effectiveness of the drug discovery and development processes.

Incorporating the ability to predict deamidation and isomerization sites in therapeutic antibodies into a quality-by-design framework allows researchers to identify, rank, and engineer potential lead candidates based on potential degradation sites, ultimately selecting more chemically stable mAbs during early development stages. By comprehending the factors that influence deamidation and isomerization, researchers can create formulations that minimize these degradation pathways, consequently enhancing the stability and shelf-life of therapeutic antibodies. Overall, this approach can decrease the risk of subsequent failure due to instability, immunogenicity, or diminished efficacy.

Materials and methods

Dataset preparation

The training dataset was obtained from the literature.Citation10 To analyze the chemical degradation of CDRs, the authors performed forced degradation studies at pH 5.5 and 40°C for two weeks and at pH 8.5 and 40°C for one week for Asp and Asn, respectively. Of the 131 analyzed clinical-stage monoclonal antibodies, 57 showed at least one degradation site, leading to a total of 91 degradation sites in CDRs and 3 in framework regions. Since the dataset is highly unbalanced with a strong majority of residues being chemically stable, only the 57 mAbs with at least one degradation-prone motif were selected for this study. Furthermore, all NX and DX motifs containing a proline residue as X were excluded because the backbone amide is not available for the degradation reaction, leading to these motifs being highly stable. The regions of interest are CDRs, because these regions act as the unique sequences that are responsible for antibodies’ specificity and efficacy. In contrast to the highly conserved framework regions, degradations in CDRs are frequent and can lead to significant loss of functionality. Therefore, only residues within CDRs were considered, leading to 260 Asn and 192 Asp motifs in the dataset. For the classification algorithm, motifs with degradation rates of 2.0% or higher were defined as degraded (liable, active), and those with lower rates as not degraded (not liable, inactive). This 2.0% threshold was chosen based on the signal intensity cutoff of experiments reported in the reference paper.Citation10 Raising the threshold did not enhance the model’s performance (see Figure S1).

Forced degradation studies on the in-house mAbs (test set used for the predictive model development) were performed at 40°C for 1 week at pH 9 and pH 6 for Asn and Asp, respectively. After tryptic digestion, mass spectroscopy was conducted to obtain the ratio of degraded Asx residues. We defined liable motifs as those with a degradation rate above 3.0%.

Homology modeling

We collected Fv regions from the selected 57 mAbs and built homology models via the Antibody Modeler module of MOE 2020.09.Citation37 The force field Amber10:EHT was selected together with the Generalized Born implicit modelCitation41 for solvation with internal and external dielectric values set respectively to 4 and 80. Nonbonded interactions were turned off using a switching function between 10 and 12 Å. The C-termini of Fv regions were capped with an amide group. After model generation, we applied structure preparation in MOE to remove bad contacts and to adjust the protonation state to pH 6.0 and ion concentration of 40 mM using the Protonate3D protocol.Citation40 All final structures were energy minimized below a gradient threshold of 10–6 kcal/mol·ÅCitation2.

Low-mode molecular dynamics

The homology models were used as a starting point for molecular sampling via the Low-Mode MDCitation35 function in MOE. Briefly, this protocol uses a short (≈1 ps) run of MD at constant temperature with a following energy minimization step. With kinetic energy mainly distributed in the low-frequency vibrational modes, the molecular system reaches many different conformations in a comparable short period of time. All Low-Mode MD searches were conducted at the typical formulation condition of pH 6 for either 50 or 400 conformations.

We also used the ST protocolCitation39 implemented in MOE, where steps of Low-Mode MD and Protonate3DCitation40 are alternately performed. The first step uses a constant-temperature conformational sampling at a given protonation state, whereas the second step applies a Monte Carlo sampling to assign the protonation state for the given conformation maintaining fixed coordinates. In such a way, the grand-canonical ensemble is preserved while mimicking the experimental conditions of forced degradation studies more accurately using different pH values. For each molecule, 1000 conformations were sampled between pH 4 and pH 10. The pH-dependent ensemble averages were then obtained for molecular descriptors at a desired pH via averaging the weighted descriptor values from all conformations. The weight depends on the energy of the conformation, as well as the number of all hydrogen atoms present at each titratable site.

Molecular dynamics simulations

For classical atomistic MD simulations, the homology models were solvated in a rectangular box with a minimum distance from the box side of 18 Å, neutralized and mixed with 40 mM NaCl in Visual Molecular Dynamics (VMD).Citation44 For equilibration and production simulations, NAMD2 and NAMD3 were used, respectively.Citation45 The CHARMM36m force fieldCitation46 was used for protein, whereas the TIP3P modelCitation47 was used for water. Nonbonded interactions were evaluated with a cutoff of 12 Å and a switching function starting at 10 Å. For long-range electrostatic interactions, the Particle Mesh Ewald (PME)Citation48 was used with a grid density of 1 Å−3. Rigid bonds for all hydrogen atoms were applied with SHAKE algorithm.Citation49 Pressure was controlled with a modified Nosé-Hoover Langevin pistonCitation50 at 1 atm and temperature was kept at 300 K with a Langevin thermostat. The simulation timestep was fixed to 2 fs, calculating Lennard-Jones forces at every timestep, but updating PME forces every two timesteps. For equilibrating the Fv structures, a multistep approach was applied as in previous studiesCitation51: (1) 2000 steps conjugate gradient energy minimization with 20 ps NVT simulation afterward restraining all but water with a high force constant of 10 kcal/mol·ÅCitation2; (2) 20 ps in NPT keeping the same restraints; (3) 20 ps NPT restraining only protein with the previous force constant; (4) 100 ps with a lower force constant of 2.5 kcal/mol·ÅCitation2 assigned to heavy atoms of protein; (5) 100 ps restraining protein backbone with the previous lower force constant; (6) 100 ps restraining protein backbone with a force constant of only 1 kcal/mol·ÅCitation2; (7) 100 ps without any restrains. Using the last snapshot of step (7) as initial coordinates and velocities, the production simulations ran for 200 ns per antibody. Analysis of the trajectories was performed in VMD by extracting 300 snapshots from the last 180 ns of each trajectory, after having centered and aligned the protein in the simulation box.

Descriptor calculation and model evaluation

Descriptor calculations for all the structures (homology models and ensembles) were performed with the QuaSAR_DescriptorMDB function in MOE.Citation37 For the MD ensemble and prior to descriptor calculation, the snapshots from the trajectories underwent the MOE structure preparation protocol to apply the default Amber10:EHT force field to the system; subsequently, a short energy minimization was performed to reach a convergence criterion of 1.0 kcal/mol·ÅCitation2.

Using all relevant residue-based descriptors together with the experimentally determined degradation rates that were converted to binary values, we built a binary prediction model based on Bayes Theorem to estimate the probability of a motif having an elevated risk of degradation.Citation38 As the original dataset is unbalanced, we chose to resample the dataset to include all liable motifs and part of the inactive ones. We set a ratio of 2:3 for active:inactive, which leads to all datasets consisting of 40% degraded motifs and 60% inactive motifs. This selection is a trade-off between having a large amount of data for training/validation of the models and a low possible level of imbalance among the two categorical classes. After splitting the dataset into 75% for training and 25% for validation, we obtained a population of each class as reported in .

Table 5. Composition of the datasets in terms of population classes.

We performed this splitting 15 times (15-fold Monte Carlo cross-validation) to randomly shuffle training and validation sets and obtain an unbiased model accuracy. The error bars shown in the figures of the Results section represent the standard error of the mean derived from these cross-validations. It was calculated by dividing the standard deviation by the square root of 15 and provides an estimate of the difference between the sample and population means. As the model’s output gives an activity probability, we set the threshold to 0.5 to classify the degradation propensity of each motif. In the decision tree model, the threshold was lowered to 0.3 instead.

To assess our model’s predictive performance, we primarily used the area under the receiver operating curve (AUC), which plots the TPR (sensitivity) against the false positive rate as a function of the applied classification threshold. We avoided using total accuracy, as Asx degradation datasets are inherently unbalanced, and a model predicting all motifs as negative would still yield high total accuracy. Instead, we employed balanced accuracy, the mean of the TPR and the TNR (specificity). Since the main goal of the prediction model was to maximize the correct prediction of liable motifs, we decided to use sensitivity, balanced accuracy, and AUC as the criteria to evaluate the performance of all models. Besides these three metrics, we also reported other evaluation metrics for information only. Precision refers to true positive predictions divided by the sum of true positive and false positive predictions and reports about the reliability of a positive prediction. In addition, the error of the model is reported as the total number of false predictions divided by the total number of data points, which implies that total accuracy and total error sum up to one. Finally, the F1 score was calculated by the harmonic mean of sensitivity and precision meaning two times the product of sensitivity and precision divided by the sum of them.

Abbreviations

Asn (N)=

Asparagine

Asp (D)=

Aspartic Acid

ASA=

Accessible Surface Area

AUC=

Area under the curve

Cα=

Alpha Carbon in Asn/Asp side chain

Cγ=

Gamma Carbon in Asn/Asp side chain

CDR=

Complementary-Determining Region

CQAs=

Critical Quality Attributes

Fv=

Variable region

Iso-Asp=

Iso-aspartic acid

mAbs=

Monoclonal Antibodies

MD=

Molecular Dynamics

NHn + 1=

Backbone Nitrogen of successor amino acid

NPV=

Negative predictive value

OH=

Hydroxyl Ion

QSAR=

Quantitative Structure-Activity Relationship

R&D=

Research and Development

RMSD=

Root Mean Square Deviation

ROC-curve=

Receiver Operating Characteristic Curve

SASA=

Solvent Accessible Surface Area

TN=

True Negative

TNR=

True Negative Rate

TP=

True Positive

TPR=

True Positive Rate

Supplemental material

Supporting_Information_revision.docx

Download MS Word (127.7 KB)

Acknowledgments

This research was supported by Boehringer Ingelheim Pharma GmbH & Co KG. We gratefully acknowledge the whole In Silico Team (IST) at Boehringer Ingelheim for excellent discussions.

Disclosure statement

D. Hoffmann, J. Bauer, A.R. Karow-Zwick and G. Licari were employees of Boehringer Ingelheim Pharma GmbH & Co.KG during the project.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/19420862.2024.2333436

Additional information

Funding

This research was supported by Boehringer Ingelheim Pharma GmbH & Co KG.

References

  • Lu R-M, Hwang Y-C, Liu I-J, Lee C-C, Tsai H-Z, Li H-J, Wu H-C. Development of therapeutic antibodies for the treatment of diseases. J Biomed Sci. 2020;27(1):1. doi:10.1186/s12929-019-0592-z.
  • DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: new estimates of R&D costs. J Health Econ. 2016;47:20–12. doi:10.1016/j.jhealeco.2016.01.012.
  • Hay M, Thomas DW, Craighead JL, Economides C, Rosenthal J. Clinical development success rates for investigational drugs. Nat Biotechnol. 2014;32(1):40–51. doi:10.1038/nbt.2786.
  • Bailly M, Mieczkowski C, Juan V, Metwally E, Tomazela D, Baker J, Uchida M, Kofman E, Raoufi F, Motlagh S. et al. Predicting antibody developability profiles through early stage discovery screening. Mabs-austin. 2020;12(1):1743053. doi:10.1080/19420862.2020.1743053.
  • Jarasch A, Koll H, Regula JT, Bader M, Papadimitriou A, Kettenberger H. Developability assessment during the selection of novel therapeutic antibodies. J Pharm Sci. 2015;104(6):1885–98. doi:10.1002/jps.24430.
  • Harris RJ, Kabakoff B, Macchi FD, Shen FJ, Kwong M, Andya JD, Shire SJ, Bjork N, Totpal K, Chen AB. Identification of multiple sources of charge heterogeneity in a recombinant antibody. J Chromatogr B Biomed Sci Appl. 2001;2001(2):233–45. doi:10.1016/s0378-4347(00)00548-x.
  • Yan B, Steen S, Hambly D, Valliere-Douglass J, Vanden Bos T, Smallwood S, Yates Z, Arroll T, Han Y, Gadgil H. et al. Succinimide formation at asn 55 in the complementarity determining region of a recombinant monoclonal antibody IgG1 heavy chain. J Pharm Sci. 2009;98(10):3509–21. doi:10.1002/jps.21655.
  • Gervais D. Protein deamidation in biopharmaceutical manufacture: understanding, control and impact. J Chem Technol Biot. 2016;91(3):569–75. doi:10.1002/jctb.4850.
  • Alam ME, Barnett GV, Slaney TR, Starr CG, Das TK, Tessier PM. Deamidation can compromise antibody colloidal stability and enhance aggregation in a pH-dependent manner. Mol Pharm. 2019;16(5):1939–49. doi:10.1021/acs.molpharmaceut.8b01311.
  • Lu X, Nobrega RP, Lynaugh H, Jain T, Barlow K, Boland T, Sivasubramanian A, Vásquez M, Xu Y. Deamidation and isomerization liability analysis of 131 clinical-stage antibodies. Mabs-austin. 2018;11(1):1–13. doi:10.1080/19420862.2018.1548233.
  • Liu YD, van Enk JZ, Flynn GC. Human antibody fc deamidation in vivo. Biologicals. 2009;37(5):313–22. doi:10.1016/j.biologicals.2009.06.001.
  • Yang N, Tang Q, Hu P, Lewis MJ. Use of in vitro systems to model in vivo degradation of therapeutic monoclonal antibodies. Anal Chem. 2018;90(13):7896–902. doi:10.1021/acs.analchem.8b00183.
  • Capasso S, Kirby AJ, Salvadori S, Sica F, Zagari A. Kinetics and mechanism of the reversible lsomerization of aspartic acid residues in tetrapeptides. J Chem Soc Perkin Trans. 1995;2(3):437. doi:10.1039/p29950000437.
  • Konuklar FAS, Aviyente V. Modelling the hydrolysis of succinimide: formation of aspartate and reversible isomerization of aspartic acid via succinimide. Org Biomol Chem. 2003;1(13):2290–97. doi:10.1039/b211936f.
  • Geiger T, Deamidation CS. Isomerization, and racemization at Asparaginyl and aspartyl residues in peptides. J Biol Chem. 1987;262(15):785–94. doi:10.1016/S0021-9258(19)75855-4.
  • Aswad DW, Paranandi MV, Schurter BT. Isoaspartate in peptides and proteins: formation, significance, and analysis. J Pharm Biomed Anal. 2000;21(6):1129–36. doi:10.1016/s0731-7085(99)00230-7.
  • Wakankar AA, Borchardt RT, Eigenbrot C, Shia S, Wang YJ, Shire SJ, Liu JL. Aspartate isomerization in the complementarity-determining regions of two closely related monoclonal antibodies. Biochemistry. 2007;46(6):1534–44. doi:10.1021/bi061500t.
  • Sreedhara A, Cordoba A, Zhu Q, Kwong J, Liu J. Characterization of the isomerization products of aspartate residues at two different sites in a monoclonal antibody. Pharm Res. 2012;29(1):187–97. doi:10.1007/s11095-011-0534-2.
  • Sharma VK, Patapoff TW, Kabakoff B, Pai S, Hilario E, Zhang B, Li C, Borisov O, Kelley RF, Chorny I. et al. In silico selection of therapeutic antibodies for development: viscosity, clearance, and chemical stability. Proc Natl Acad Sci U S A. 2014;111(52):18601–06. doi:10.1073/pnas.1421779112.
  • Sydow JF, Lipsmeier F, Larraillet V, Hilger M, Mautz B, Mølhøj M, Kuentzer J, Klostermann S, Schoch J, Voelger HR. et al. Structure-based prediction of asparagine and aspartate degradation sites in antibody variable regions. PloS One. 2014;9(6):e100736. doi:10.1371/journal.pone.0100736.
  • Láng A, Jákli I, Enyedi KN, Mező G, Menyhárd DK, Perczel A. Off-pathway 3D-structure provides protection against spontaneous Asn/Asp isomerization: shielding proteins achilles heel. Q Rev Biophys. 2020;53:e2. doi:10.1017/s003358351900009x.
  • Ince HH, Konuklar FAS, Ugur I, Ozcan ÖA, Sayadi M, Feig M, Aviyente V. Role of the n+1 amino acid residue on the deamidation of asparagine in pentapeptides. Molecular Physics. 2015;113(23):3839–48. doi:10.1080/00268976.2015.1068394.
  • Phillips JJ, Buchanan A, Andrews J, Chodorge M, Sridharan S, Mitchell L, Burmeister N, Kippen AD, Vaughan TJ, Higazi DR. et al. Rate of Asparagine Deamidation in a monoclonal antibody correlating with hydrogen exchange rate at adjacent downstream residues. Anal Chem. 2017;89(4):2361–68. doi:10.1021/acs.analchem.6b04158.
  • DiCara DM, Andersen N, Chan R, Ernst JA, Ayalon G, Lazar GA, Agard NJ, Hilderbrand A, Hötzel I. High-throughput screening of antibody variants for chemical stability: identification of deamidation-resistant mutants. Mabs-austin. 2018;10(7):1–11. doi:10.1080/19420862.2018.1504726.
  • Clarke S. Propensity for spontaneous succinimide formation from aspartyl and asparaginyl residues in cellular proteins. Int J Pept Protein Res. 1987;30(6):808–21. doi:10.1111/j.1399-3011.1987.tb03390.x.
  • Robinson NE, Robinson AB. Prediction of protein deamidation rates from primary and three-dimensional structure. Proc Natl Acad Sci U S A. 2001;98(8):4367–72. doi:10.1073/pnas.071066498.
  • Stephenson RC, Clarke S. Succinimide formation from Aspartyl and asparaginyl peptides as a model for the spontaneous degradation of proteins. J Biol Chem. 1989;264(11):6164–70. doi:10.1016/s0021-9258(18)83327-0.
  • Wright HT, Urry DW. Nonenzymatic deamidation of asparaginyl and glutaminyl residues in proteins. Crit Rev Biochem Mol Biol. 1991;26(1):1–52. doi:10.3109/10409239109081719.
  • Lorenzo JR, Alonso LG, Sánchez IE, Lisacek F. Prediction of spontaneous protein deamidation from sequence-derived secondary structure and intrinsic disorder. PloS One. 2015;10(12):e0145186. doi:10.1371/journal.pone.0145186.
  • Jia L, Sun Y, de Brevern AG. Protein asparagine deamidation prediction based on structures with machine learning methods. PloS One. 2017;12(7):e0181347. doi:10.1371/journal.pone.0181347.
  • Plotnikov NV, Singh SK, Rouse JC, Kumar S. Quantifying the risks of asparagine deamidation and aspartate isomerization in biopharmaceuticals by computing reaction free-energy surfaces. J Phys Chem B. 2017;121(4):719–30. doi:10.1021/acs.jpcb.6b11614.
  • Yan Q, Huang M, Lewis MJ, Hu P. Structure based prediction of asparagine deamidation propensity in monoclonal antibodies. Mabs-austin. 2018;10(6):901–12. doi:10.1080/19420862.2018.1478646.
  • Brennan TV, Clarke S. Spontaneous degradation of polypeptides at aspartyl and asparaginyl residues: effects of the solvent dielectric. Protein Sci. 1993;2(3):331–38. doi:10.1002/pro.5560020305.
  • Brennan TV, Clarke S. Effect of adjacent histidine and cysteine residues on the spontaneous degradation of asparaginyl- and aspartyl-containing peptides. Int J Pept Protein Res. 1995;45(6):547–53. doi:10.1111/j.1399-3011.1995.tb01318.x.
  • Labute P. LowModeMD–implicit low-mode velocity filtering applied to conformational search of macrocycles and protein loops. J Chem Inf Model. 2010;50(5):792–800. doi:10.1021/ci900508k.
  • Martin KP, Grimaldi P, Grempler R, Hansel S, Kumar S. Trends in industrialization of biotherapeutics: a survey of product characteristics of 89 antibody-based biotherapeutics. Mabs-austin. 2023;15(1):2191301. doi:10.1080/19420862.2023.2191301.
  • (Molecular Operating Environment (MOE), 2022.02), (Chemical Computing Group ULC, 1010 Sherbooke St. West, Suite #910, Montreal, QC, Canada, H3A 2R7). 2022.
  • Labute P. Binary QSAR: a new method for the determination of quantitative structure activity relationships. Pac Symp Biocomput. 1999;444–55. doi:10.1142/9789814447300_0044.
  • Baptista AM, Teixeira VH, Soares CM. Constant- p H molecular dynamics using stochastic titration. J Chem Phys. 2002;117(9):4184–200. doi:10.1063/1.1497164.
  • Labute P. Protonate3D: assignment of ionization states and hydrogen coordinates to macromolecular structures. Proteins. 2009;75(1):187–205. doi:10.1002/prot.22234.
  • Labute P. The generalized born/volume integral implicit solvent model: estimation of the free energy of hydration using London dispersion instead of atomic surface area. J Comput Chem. 2008;29(10):1693–98. doi:10.1002/jcc.20933.
  • Kosky AA, Razzaq UO, Treuheit MJ, Brems DN. The effects of alpha-helix on the stability of asn residues: deamidation rates in peptides of varying helicity. Protein Sci. 1999;8(11):2519–23. doi:10.1110/ps.8.11.2519.
  • Irudayanathan FJ, Zarzar J, Lin J, Izadi S. Deciphering deamidation and isomerization in therapeutic proteins: effect of neighboring residue. Mabs-austin. 2022;14(1):2143006. doi:10.1080/19420862.2022.2143006.
  • Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graph. 1996;14(1):33–8, 27–8. doi:10.1016/0263-7855(96)00018-5.
  • Phillips JC, Hardy DJ, Maia JDC, Stone JE, Ribeiro JV, Bernardi RC, Buch R, Fiorin G, Hénin J, Jiang W. et al. Scalable molecular dynamics on CPU and GPU architectures with NAMD. J Chem Phys. 2020;153(4):44130. doi:10.1063/5.0014475.
  • Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, de Groot BL, Grubmüller H, MacKerell AD. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods. 2017;14(1):71–73. doi:10.1038/nmeth.4067.
  • Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. J Chem Phys. 1983;79(2):926–35. doi:10.1063/1.445869.
  • Darden T, York D, Pedersen L. Particle mesh Ewald: an N ⋅log(N) method for Ewald sums in large systems. J Chem Phys. 1993;98(12):10089–92. doi:10.1063/1.464397.
  • Ryckaert J-P, Ciccotti G, Berendsen HJ. Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J Comput Phys. 1977;23(3):327–41. doi:10.1016/0021-9991(77)90098-5.
  • Martyna GJ, Tobias DJ, Klein ML. Constant pressure molecular dynamics algorithms. J Chem Phys. 1994;101(5):4177–89. doi:10.1063/1.467468.
  • Licari G, Martin KP, Crames M, Mozdzierz J, Marlow MS, Karow-Zwick AR, Kumar S, Bauer J. Embedding dynamics in intrinsic physicochemical profiles of market-stage antibody-based biotherapeutics. Mol Pharm. 2023;20(2):1096–111. doi:10.1021/acs.molpharmaceut.2c00838.