Alternative cancer therapy through modeling pteridines photosensitizer quantum yield singlet oxygen production using swarm-based support vector regression and extreme learning machine

Abstract

Photodynamic cancer therapy circumvents the major side effects associated with conventional cancer treatment methods such as chemotherapy, surgery and exposure to radiation. Experimental measurement of the photosensitizer quantum yield (PQY) of singlet oxygen production, through either sensitive laser spectroscopy or luminescence detection at a wavelength of 1270 nm, is costly, time-consuming and labor-intensive, while the unreliability of the chemical-trap experimental approach is of serious concern. The quantitative structure–activity relationship (QSAR) computational method proposed in the literature for computing the PQY of singlet oxygen production shows a high degree of deviation from the measured values. In this contribution, the PQY of singlet oxygen production of twenty-nine pteridine photosensitizer compounds is modeled and predicted using extreme learning machine (ELM) and support vector regression (SVR) hybridized with particle swarm optimization (PSO) for combined hyper-parameter selection. The performance of the developed SVR-PSO computational method is assessed using mean absolute error (MAE), correlation coefficient (CC), root mean square error (RMSE) and mean absolute percentage deviation (MAPD). The developed SVR-PSO model outperforms the QSAR (2016) model with performance superiority of 34.78%, 3.65%, 17.64% and 42.16% on the basis of RMSE, CC, MAE and MAPD, respectively. The developed ELM-SINE (with sine activation function) and ELM-SIG (with sigmoid activation function) models outperform the existing QSAR (2016) model with improvements of 6.54% and 4.70%, respectively, on the R-squared metric. The performance of the present predictive models strengthens the potential of alternative cancer therapy and helps circumvent the experimental challenges of determining the PQY of singlet oxygen production.

1. Introduction

Chemical compounds that become excited by incident light of a certain wavelength are referred to as photosensitizers (Frimayanti et al., Citation2011). Among the features of photosensitizers that are useful in several applications is their capacity to generate singlet oxygen (Buglak et al., Citation2016). Potential areas of singlet oxygen application include fine chemical synthesis, wastewater treatment, solar cells and photodynamic cancer therapy, among others (Buglak et al., Citation2016). Photodynamic cancer therapy offers a unique and novel method of cancer treatment with relatively few side effects compared with conventional methods such as chemotherapy, surgery and radiation (Mfouo-Tynga et al., Citation2021; Wang, Liang, et al., Citation2021; Wang, Li, et al., Citation2021; Lin et al., Citation2021). This method of cancer treatment targets tumor cells through the combined effect of light and a photosensitizer (a toxic drug that can be activated by light). Photons are absorbed by the photosensitive drug and excite it. The energy of the excited drug is transferred to oxygen, forming singlet oxygen, a highly reactive oxygen species that subsequently oxidizes cellular structures (Oliveira et al., Citation2011). The energy transfer from the excited photosensitizer results from the interaction of the triplet and singlet states of the sensitizer. The cancerous cell dies when the oxidative damage caused by singlet oxygen exceeds a threshold level (Shi et al., Citation2015). The photosensitizer quantum yield (PQY) of singlet oxygen production is computed as the ratio of the amount of singlet oxygen generated by the photosensitizer to the amount of absorbed photons. Singlet oxygen generation is influenced by solvent polarity, oxidation potential, aggregation, energy of the triplet state, electronic configuration, steric effects, excimer formation and structural effects (Buglak et al., Citation2016; Derosa & Crutchley, Citation2002). Experimental determination of the PQY of singlet oxygen production for photosensitizers is carried out through either sensitive laser spectroscopy or singlet oxygen luminescence detection at a wavelength of 1270 nm. The sensitive laser spectroscopy approach is expensive, time-consuming and labor-intensive. Another, indirect experimental approach is to use singlet oxygen chemical traps. The interaction of free radicals, photosensitizer molecules and peroxides with the chemical traps limits the reliability of the chemical-trap approach, while the need for heavy water (D₂O) to achieve singlet oxygen luminescence in water makes the luminescence detection method costly and laborious (Buglak et al., Citation2016). The quantitative structure–activity relationship (QSAR) approach has been proposed in the literature to circumvent the experimental challenges associated with the PQY of singlet oxygen production, but the results of the existing QSAR approach are characterized by a high degree of deviation. The extreme learning machine (ELM) algorithm and the hybridization of support vector regression (SVR) with particle swarm optimization (PSO) proposed in this work demonstrate superior performance over the existing model.

Among the well-known photosensitizers for photodynamic cancer therapy are porfimer sodium and pteridine photosensitizer compounds. Pteridine photosensitizers are azaheterocyclic structures of conjugated pyrimidine and pyrazine rings found in biological systems in a reduced or oxidized form (Dantola et al., Citation2017). The pteridine compounds include the flavins (found in dihydro-reduced or oxidized form) and the pterin sensitizers (found in tetrahydro-reduced, half-reduced or oxidized form). High potential for electron and energy transfer, high phosphorescence yields, long-lived excited (triplet) states, photochemical activity and fluorescence characterize the oxidized forms of flavin and pterin photosensitizers. Pterins belong to a family of heterocyclic compounds whose common derivatives are substituted at position 6. Pterins are classified into unconjugated and conjugated pterins according to the functional group and molecular weight of their substituents (Buglak et al., Citation2016). The unconjugated form is characterized by substituents with one carbon atom or a short hydrocarbon chain, while the conjugated form contains larger substituents with an aminobenzoic acid part. Flavins have enormous potential for photodynamic cancer therapy and singlet oxygen generation, and have demonstrated high effectiveness compared with the commonly used exogenous porphyrins in eradicating unwanted cells. The compositional potential of pteridines, coupled with their availability in living systems, calls for investigating and enhancing their potential in photodynamic cancer therapy using the ELM and the hybrid SVR and particle swarm optimization algorithm presented in this work.

SVR is an intelligent computational method for solving complex real-life problems based on convex quadratic programming (Basak et al., Citation2007; Science et al., Citation2021; Drucker et al., Citation1996). Regularization and the kernel method guide the estimation principles of this algorithm. The algorithm reduces the generalization error through the structural risk minimization principle. This feature, coupled with convergence to a global solution, has made the algorithm indispensable in many applications, including transition temperature prediction in Fe-based superconducting systems (Akomolafe et al., Citation2021), useful life estimation of lithium-ion batteries (Li et al., Citation2021) and energy gap modeling for enhanced photocatalytic application (Owolabi et al., Citation2021). The parameters of the SVR algorithm can be selected using approaches ranging from manual selection and grid search to heuristic methods based on evolutionary population-based algorithms (Wu & Wang, Citation2020). The population-based algorithm implemented in this work is PSO because of its ease of implementation, fast convergence, high accuracy and avoidance of premature convergence (Wang, Liang, et al., Citation2021; Wang, Li, et al., Citation2021; Aboelkassem & Savic, Citation2021). The algorithm shares similar properties with the genetic algorithm, except that it has no mutation and crossover operations. Hybridization of this optimization algorithm with the SVR algorithm results in a robust and precise model for determining the pteridine PQY of singlet oxygen production. Another intelligent algorithm employed for modeling the PQY of singlet oxygen production in this work is ELM (Huang et al., Citation2006). ELM is a single-hidden-layer network that assigns the network biases and input weights randomly and computes the output weights using the Moore–Penrose inverse approach. The uniqueness of the ELM and hybrid PSO-SVR algorithms is harnessed in this work for determining the PQY of singlet oxygen production.

The remaining part of this article is organized into different sections; Section 2 contains the mathematical background of the employed intelligent algorithms coupled with principles governing the implemented optimization technique. Section 3 discusses the acquisition of data for modeling and simulation. Section 4 discusses the results and compares the outcomes of the present and existing models using different performance metrics. Section 5 presents the conclusion of the manuscript.

2. Formulations of the implemented algorithms

The mathematical description of the utilized and implemented computational algorithms is discussed. This section also presents mathematical background and description of the implemented optimization algorithm.

2.1. Mathematical background of support vector regression algorithm

SVR is a learning algorithm that aims at minimizing the distance between the established fitting equation and the processed data within a defined error limit controlled by the insensitive loss zone (Balogun et al., Citation2021; Dodangeh et al., Citation2020). The non-linear input vectors are transformed to a higher dimensional space using a kernel function, while the construction of the hyper-plane takes place in the feature space and is controlled by the lambda hyper-parameter. Consider a given training set of data $D = \{(η_{1}, ϕ_{Δ 1}), (η_{2}, ϕ_{Δ 2}), \dots, (η_{m}, ϕ_{Δ m})\}$ where $η_{j} \in R^{n}$ (with $n$ the number of descriptors), $ϕ_{Δ j} \in R$ and $j = 1, 2, \dots, m$, in which $η_{j}$ represents the descriptors that include molecular solubility, highest occupied molecular orbital energy, dipole density, electrostatic, electronegativity and dipole, while $ϕ_{Δ j}$ stands for the experimentally measured quantum yield of singlet oxygen production of the pteridine photosensitizer. The regression equation within the SVR description is shown in Equation (1):

$ϕ_{Δ}^{p} = ω^{T} η + b$ (1)

where $ω$ is the vector of weight coefficients and $b$ is the bias.

The SVR learning algorithm finds a function through precise approximation such that $ϕ_{Δ j}^{p} \approx ϕ_{Δ j}$ is satisfied, where $ϕ_{Δ}^{p}$ stands for the predicted quantum yield of singlet oxygen production for the pteridine photosensitizer. The algorithm determines $ω$ and $b$ by solving an optimization problem governed by the $ε$-insensitive loss function, which constrains the acquired expression $ϕ_{Δ}^{p}$ to an error limit of $ε$. This objective is achieved through minimization of the Euclidean norm $\frac{{‖ ω ‖}^{2}}{2}$ subject to the constraints presented in Equation (2) (Owolabi, Citation2019a, Citation2019b; Rui et al., Citation2019; Murillo-Escobar et al., Citation2019; Ju et al., Citation2019):

$\left\{ \begin{matrix} ϕ_{Δ j} - (ω^{T} η_{j} + b) \leq ε \\ (ω^{T} η_{j} + b) - ϕ_{Δ j} \leq ε \end{matrix} \right.$ (2)

In practice, not every data-point $(η, ϕ_{Δ})$ in $D$ can be approximated within the $ε$ error limit, which can make the optimization problem infeasible. This is addressed by incorporating non-negative slack variables $(ξ_{j}, ξ_{j}^{*})$. With this incorporation, the modified convex optimization problem is given in Equation (3) and the associated constraints are shown in Equation (4) (Olubi et al., Citation2021):

$Minimize \; \frac{{‖ ω ‖}^{2}}{2} + C \sum_{j = 1}^{m} (ξ_{j} + ξ_{j}^{*})$ (3)

$\left\{ \begin{array}{l} ϕ_{Δ j} - (ω^{T} η_{j} + b) \leq ε + ξ_{j} \\ (ω^{T} η_{j} + b) - ϕ_{Δ j} \leq ε + ξ_{j}^{*} \\ ξ_{j}, ξ_{j}^{*} \geq 0 \end{array} \right.$ (4)
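As an illustration of Equations (3) and (4), the sketch below evaluates the soft-margin objective for a fixed linear model. It relies on the standard fact that, at the optimum, the slack sum $ξ_{j} + ξ_{j}^{*}$ for a sample equals the amount by which its absolute error exceeds $ε$. The function names and the toy numbers are hypothetical and are not taken from the pteridine dataset.

```python
def epsilon_insensitive_loss(measured, predicted, epsilon):
    """Amount by which the absolute error exceeds epsilon (zero inside the tube)."""
    return max(0.0, abs(measured - predicted) - epsilon)


def svr_primal_objective(weights, bias, descriptors, targets, C, epsilon):
    """Evaluate ||w||^2 / 2 + C * sum of slacks for the linear model w^T eta + b."""
    norm_term = 0.5 * sum(w * w for w in weights)
    slack_sum = 0.0
    for eta, phi in zip(descriptors, targets):
        prediction = sum(w * x for w, x in zip(weights, eta)) + bias
        slack_sum += epsilon_insensitive_loss(phi, prediction, epsilon)
    return norm_term + C * slack_sum


# Toy one-descriptor data (illustrative values only).
descriptors = [[0.0], [1.0], [2.0]]
targets = [0.10, 0.35, 0.90]
objective = svr_primal_objective([0.3], 0.1, descriptors, targets, C=10.0, epsilon=0.05)
print(round(objective, 6))
```

Only the third sample falls outside the $ε$ tube in this toy case, so a larger C penalizes that single deviation more heavily while the norm term stays unchanged.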

The parameter C in Equation (3), which must be greater than zero, is called the regularization factor and controls the trade-off between the maximum allowable deviation and the complexity of the model. When the dual formulation is invoked with Lagrange multipliers ($ψ_{j}$, $ψ_{j}^{*}$), the chosen kernel function $χ (η_{j}, η_{i})$ returns the pairwise dot product in the high-dimensional feature space without explicit data mapping. This results in the expression shown in Equation (5), subject to the conditions in Equation (6) (Owolabi, Citation2023):

$\max \; - \frac{1}{2} \sum_{j, i = 1}^{m} (ψ_{j} - ψ_{j}^{*}) (ψ_{i} - ψ_{i}^{*}) χ (η_{j}, η_{i}) - ε \sum_{j = 1}^{m} (ψ_{j} + ψ_{j}^{*}) + \sum_{j = 1}^{m} ϕ_{Δ j} (ψ_{j} - ψ_{j}^{*})$ (5)

$\left\{ \begin{matrix} \sum_{j = 1}^{m} (ψ_{j} - ψ_{j}^{*}) = 0 \\ ψ_{j}, ψ_{j}^{*} \in [0, C] \end{matrix} \right.$ (6)

The SVR approximated function for a descriptor vector $η$, expressed in terms of the kernel and the Lagrange multipliers, is presented in Equation (7), while the function that best transforms the data for PQY singlet oxygen production prediction is presented in Equation (8):

$ϕ_{Δ}^{p} = \sum_{j = 1}^{m} (ψ_{j} - ψ_{j}^{*}) χ (η_{j}, η) + b$ (7)

$χ (η_{j}, η_{i}) = \exp (- α {‖ η_{j} - η_{i} ‖}^{2})$ (8)

where $α > 0$ represents the parameter of the kernel function, also called the kernel option.
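The prediction rule in Equation (7) can be sketched in a few lines once the support vectors and multiplier differences are known. The sketch assumes the usual Gaussian form of the kernel in Equation (8), with a negative exponent and $α > 0$. The support vectors, coefficients and bias below are hypothetical placeholders, not values from the trained model.

```python
import math


def gaussian_kernel(eta_a, eta_b, alpha):
    """Gaussian kernel exp(-alpha * ||eta_a - eta_b||^2); alpha > 0 is assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(eta_a, eta_b))
    return math.exp(-alpha * squared_distance)


def svr_predict(eta, support_vectors, dual_coeffs, bias, alpha):
    """Equation (7): sum_j (psi_j - psi_j*) * kernel(eta_j, eta) + b."""
    kernel_sum = sum(
        coeff * gaussian_kernel(sv, eta, alpha)
        for sv, coeff in zip(support_vectors, dual_coeffs)
    )
    return kernel_sum + bias


# Hypothetical support vectors and (psi_j - psi_j*) differences.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coeffs = [0.5, -0.2]
prediction = svr_predict([0.0, 0.0], support_vectors, dual_coeffs, bias=0.1, alpha=0.5)
print(round(prediction, 6))
```

Because the kernel depends only on the distance between descriptor vectors, a query point that coincides with a support vector contributes that vector's full coefficient, while distant support vectors contribute almost nothing.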

The regularization factor C, the kernel option $α$ and epsilon $ε$ are the SVR hyper-parameters that significantly influence the model performance. These parameters are selected and optimized using an optimization algorithm based on swarm movement.

2.2. Particle swarm optimization algorithm

The PSO technique is a meta-heuristic, population-based technique for searching for a global solution, with the features of simplicity, convenience and fast convergence speed (Wang, Liang, et al., Citation2021; Wang, Li, et al., Citation2021; Zou, Citation2021; Beheshti, Citation2020). The algorithm initializes a population of random solutions, where each solution (referred to as a particle) is allocated a velocity with which the particle searches within an N-dimensional search space. The present position of the particle is controlled using the particle fitness, while the particle always keeps its best position in memory (Yildiz, Citation2019; Yıldız et al., Citation2019). Consider a swarm of particles occupying an N-dimensional search space where $y_{k} (j) = [y_{k 1}, y_{k 2}, \dots, y_{kN}]$ represents the spatial position of the $k$th particle within the search space. Suppose $V_{k} (j) = [v_{k 1}, v_{k 2}, \dots, v_{kN}]$ represents the velocity of the particle, the best position attained so far by the individual particle is expressed as $r_{k} (j) = [r_{k 1}, r_{k 2}, \dots, r_{kN}]$ and the best position attained within the swarm is represented as $z_{k} (j) = [z_{k 1}, z_{k 2}, \dots, z_{kN}]$. Equations (9) and (10), respectively, update the velocity and the position of the $k$th particle as the particle moves from generation $j - 1$ to generation $j$:

$v_{k} (j) = λ v_{k} (j - 1) + c_{1} \, rand \, [r_{k} (j - 1) - y_{k} (j - 1)] + c_{2} \, rand \, [z_{k} (j - 1) - y_{k} (j - 1)]$ (9)

$y_{k} (j) = v_{k} (j) + y_{k} (j - 1)$ (10)

where $λ$ represents the inertial weight, $rand$ is a random number within the $[0, 1]$ interval that enhances the randomness of the search, and $c_{1}$ and $c_{2}$ are non-negative acceleration parameters that influence the local and global learning rates (Owolabi, Citation2019a, Citation2019b). The particle number satisfies $k = 1, 2, \dots, p$, where p is the maximum number of particles in the swarm.
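A single application of Equations (9) and (10) to one particle might look like the sketch below. It assumes that a fresh random number is drawn for every dimension and every term, which the text does not specify, and that the second attractor is the best position found in the swarm. The function name `pso_step` and the sample values are hypothetical.

```python
import random


def pso_step(position, velocity, personal_best, swarm_best, inertia, c1, c2, rng):
    """Apply Equations (9) and (10) to one particle; returns (new_position, new_velocity)."""
    new_velocity = []
    for y, v, r, z in zip(position, velocity, personal_best, swarm_best):
        pull_to_own_best = c1 * rng.random() * (r - y)    # second term of Equation (9)
        pull_to_swarm_best = c2 * rng.random() * (z - y)  # third term of Equation (9)
        new_velocity.append(inertia * v + pull_to_own_best + pull_to_swarm_best)
    new_position = [y + v for y, v in zip(position, new_velocity)]  # Equation (10)
    return new_position, new_velocity


rng = random.Random(0)
# A particle sitting at both best positions with zero velocity does not move.
rest_pos, rest_vel = pso_step([1.0, 2.0], [0.0, 0.0], [1.0, 2.0], [1.0, 2.0], 0.9, 2.0, 2.0, rng)
# With both acceleration parameters set to zero, only the inertia term acts.
drift_pos, drift_vel = pso_step([1.0, 2.0], [2.0, -4.0], [0.0, 0.0], [0.0, 0.0], 0.5, 0.0, 0.0, rng)
print(rest_pos, drift_pos)
```

The two calls isolate the terms of Equation (9): the first shows that the attraction terms vanish at the best positions, the second that the inertial weight alone scales the previous velocity.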

2.3. Extreme learning machine

ELM is a neural network learning algorithm for single-hidden-layer feedforward networks (SLFNs) (Huang et al., Citation2006; Oyeneyin, Citation2021; Alqahtani et al., Citation2022). Notably, error optimization using gradient descent during back-propagation (BP) is a time-consuming and computationally intensive process, which usually leads to slow training. Also, the resulting models usually generalize poorly because the initial values of the weights are context and use-case specific (Oyeneyin, Citation2021). In ELM, the weights between the input and the hidden layer are assigned random values, typically drawn from a uniform probability distribution. They remain unchanged throughout the training process, which eliminates the iterative hidden-layer adjustment usually associated with neural networks. The weights between the hidden and output layer can then be derived analytically using the Moore–Penrose pseudo-inverse. The basic structure of an ELM is given in Equation (11):

$\sum_{i = 1}^{h} ɸ_{i} g (w_{i} \cdot x_{j} + b_{i}) = y_{j}$ (11)

where $ɸ_{i}$ is the output weight vector, $w_{i}$ is the input weight vector, $x_{j}$ and $y_{j}$ denote the $j$th input and output vectors, $b_{i}$ is the bias of the $i$th hidden neuron, $g$ is the activation function of the $i$th neuron and $h$ is the number of hidden neurons. Equation (11) can be further expressed as a system of matrices as shown in Equation (12):

$H β = T$ (12)

where $H$ is the hidden-layer output matrix, $β$ collects the output weights $ɸ_{i}$ and $T$ is the matrix of targets.

The matrix of the output weights $β$ can be found analytically by solving the system of matrices in Equation (12). ELM has demonstrated better performance than similar neural networks, with extremely fast training and better generalization ability (Shamsah & Owolabi, Citation2020).
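The training procedure described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the implementation used in this work: it solves Equation (12) in the least-squares sense through the normal equations $H^{T} H β = H^{T} T$, which coincide with the Moore–Penrose solution only when $H$ has full column rank, and it uses a sine activation on a toy one-dimensional target. All function names, the uniform range $[-1, 1]$ and the data are hypothetical.

```python
import math
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def hidden_output(x, input_weights, biases, activation):
    """One row of H: g(w_i . x + b_i) for every hidden neuron i."""
    return [
        activation(sum(w * xi for w, xi in zip(w_row, x)) + b)
        for w_row, b in zip(input_weights, biases)
    ]


def elm_train(X, T, n_hidden, activation, rng):
    """Draw random input weights and biases, then solve for the output weights."""
    n_inputs = len(X[0])
    input_weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = [hidden_output(x, input_weights, biases, activation) for x in X]
    rows = range(len(H))
    HtH = [[sum(H[k][i] * H[k][j] for k in rows) for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [sum(H[k][i] * T[k] for k in rows) for i in range(n_hidden)]
    beta = solve_linear_system(HtH, HtT)  # least-squares output weights
    return input_weights, biases, beta


def elm_predict(x, model, activation):
    input_weights, biases, beta = model
    return sum(b * h for b, h in zip(beta, hidden_output(x, input_weights, biases, activation)))


# Toy regression target (illustrative only): y = x^2 on ten points in [0, 1].
X = [[i / 9.0] for i in range(10)]
T = [x[0] ** 2 for x in X]
model = elm_train(X, T, n_hidden=3, activation=math.sin, rng=random.Random(0))
sse = sum((elm_predict(x, model, math.sin) - t) ** 2 for x, t in zip(X, T))
print(sse)
```

Replacing `math.sin` with a sigmoid function gives the counterpart of the ELM-SIG variant; only the output weights are fitted in either case, which is why no iterative training loop appears.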

3. Computational methodology of the hybrid model

The data acquisition strategies and description of descriptors are presented in this section. Method of algorithm hybridization is also presented.

3.1. Dataset description and acquisition

The PQY of singlet oxygen production for twenty-nine pteridine photosensitizers was modeled using molecular descriptors that include molecular solubility, highest occupied molecular orbital energy, dipole density, electrostatic, electronegativity and dipole. The measured dataset used for the modeling and simulation was extracted from the literature (Buglak et al., Citation2016). Each of the considered descriptors has a direct effect on the PQY of singlet oxygen production for pteridine photosensitizers. The molecular solubility is inversely related to the PQY of singlet oxygen production. This relationship can be understood from the deactivation induced by charge transfer, which increases in polar solvents. The charge-transfer exciplex gains no access to water molecules because low-solubility groups make up the larger part of the substituents. This consequently lowers the rate at which these compounds quench singlet oxygen. The interplay between singlet oxygen quenching and radiationless energy transfer due to dipole-dipole interaction explains the influence of dipole energy on the PQY of singlet oxygen production. The electron attraction tendency of a molecule is measured by its electronegativity. When the quencher molecule becomes the electron donor in a charge-transfer interaction, the total rate constant of singlet oxygen quenching increases, which corresponds to a lower electronegativity and a higher highest occupied molecular orbital energy. Preliminary outcomes of the statistical analysis performed on the dataset of the investigated 29 pteridine photosensitizers are presented in Table 1.

Table 1. Statistical results of the investigated pteridine photosensitizer.

The dipole, electrostatic and dipole density descriptors are weakly correlated with the quantum yield of singlet oxygen production, while the highest occupied molecular orbital energy shows a negative correlation. The coefficient of correlation shows the extent of the linear connection between the descriptors and the targets. The maximum and minimum values presented in Table 1 show the range of the dataset, while the consistency of the measurements is indicated by the presented standard deviations. The uniqueness and relevance of the chosen descriptors enhance the developed predictive models for precise estimation of the quantum yield of singlet oxygen production.

3.2. Computational hybridization of PSO with SVR algorithm

Modeling of the quantum yield of singlet oxygen production of pteridine photosensitizers for photodynamic cancer therapy application was carried out in the MATLAB computing environment. The available descriptors (molecular solubility, highest occupied molecular orbital energy, dipole density, electrostatic, electronegativity and dipole) and their corresponding targets were randomized prior to dataset partitioning. The essence of randomization is to ensure an even distribution of data-points so that the developed model is not biased toward the training or testing phase. The twenty-nine pteridine compounds available for simulation were separated into training and testing sets in the ratio of 8:2. The training set was employed for support vector acquisition, while the testing samples were used to validate the performance of the support vectors acquired during the training phase. The step-by-step procedures of algorithm hybridization are itemized as follows:

Step I: Particle randomization and initialization: at $j = 0$, $k$ particles were randomized uniformly and initialized as $r_{kN} (0)$ within $[r_{min}, r_{max}]$ such that $r_{min} = - r_{max}$. Initialization of the velocity takes a similar approach, with a uniform distribution of $v_{kN} (0)$ within the $[v_{min}, v_{max}]$ range, in which $v_{min} = - v_{max}$. The maximum velocity in a given direction is expressed as $v_{k, max} = \frac{y_{k, max} - y_{k, min}}{i}$, where $i$ is the number of intervals chosen.
Step II: Fitness determination through objective function implementation: Each particle within the swarm was set at its current best position $z_{k}^{*} (0) = z_{k} (0)$ while the corresponding fitness is represented as $F_{k}^{*} (0) = F_{k} (0)$. The particle with the best fitness $F_{best}$ (corresponding to the lowest root mean square error [RMSE]) was set as the global best with position $z_{k}^{* *} (0)$ while $F_{k}^{* *} (0) = F_{best}$. The following steps were implemented while determining the fitness of each particle in the swarm.
Step A: Mapping function: A function was selected among several candidate functions that can transform the input set of data to a high dimensional space. The available functions include linear, sigmoid, Gaussian and polynomial.
Step B: Selection of a particle in the swarm for fitness computation: a particle that houses candidate values of the hyper-parameters, in the order of penalty factor, kernel parameter and epsilon, was selected. The selected particle, together with the training set of data and the chosen kernel function, was used for SVR training and the trained model was evaluated using RMSE-training. The acquired support vectors were saved for future use.
Step C: Validation of the fitness of the trained model: Using the testing data samples alone together with the support vectors acquired in Step B, the fitness of the trained model was validated using RMSE-testing.
Step D: Repeat Step B and Step C for every particle in the swarm and save the support vectors together with the values of the parameters that measure the fitness of each particle.
Step III: Iteration update: the iteration counter was updated through $j = j + 1$.
Step IV: Updating inertial weight $λ$: The inertial weight is updated according to $λ (j) = β λ (j - 1)$ where $β$ represents a constant whose value approaches unity.
Step V: Update in velocity and position: The velocity and position of each particle are updated in every iteration, using Equations (9) and (10), respectively, until the stopping condition is attained.
Step VI: Updating individual best position if necessary: If $F_{k} (j) < F_{k}^{*} (j)$, the individual best position and the corresponding fitness are updated as $z_{k}^{*} (j) = z_{k} (j)$ and $F_{k}^{*} (j) = F_{k} (j)$. If the condition $F_{k} (j) < F_{k}^{*} (j)$ is not met, proceed without updating.
Step VII: Update the global best position if necessary: if the condition $F_{min} (j) < F^{* *} (j)$ is met, then $F^{* *} (j) = F_{min}$ and $z^{* *} (j) = z_{min} (j)$.
Step VIII: Stopping condition: Stop if $F_{min} = 0$ or if $F_{min}$ retains the same value for fifty consecutive iterations. Otherwise, return to Step III.
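The control flow of Steps I to VIII can be sketched as follows. This is not the MATLAB implementation used in this work: the fitness function `surrogate_rmse` is a hypothetical stand-in for the RMSE-testing value that Steps A to D would obtain by training an SVR with the hyper-parameters carried by each particle. The search bounds, swarm size, acceleration parameters, interval count of 10, clamping of velocities and positions, and the iteration cap are likewise assumptions made for illustration. The global best is updated as soon as a particle improves on it, which is one common variant.

```python
import math
import random


def surrogate_rmse(params):
    """Hypothetical stand-in for RMSE-testing; minimum 0 at (C, alpha, epsilon) = (10, 0.5, 0.1)."""
    target = (10.0, 0.5, 0.1)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(params, target)) / len(target))


def pso_minimize(fitness, bounds, n_particles=20, c1=2.0, c2=2.0, inertia=0.9,
                 decay=0.99, patience=50, max_iter=1000, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Step I: uniform initialization of positions and velocities.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v_max = [(hi - lo) / 10.0 for lo, hi in bounds]  # assumed interval count of 10
    velocities = [[rng.uniform(-vm, vm) for vm in v_max] for _ in range(n_particles)]
    # Step II: initial individual and global bests.
    best_pos = [p[:] for p in positions]
    best_fit = [fitness(p) for p in positions]
    g = min(range(n_particles), key=lambda k: best_fit[k])
    global_pos, global_fit = best_pos[g][:], best_fit[g]
    stagnant = 0
    for _ in range(max_iter):            # Step III: next iteration
        inertia *= decay                 # Step IV: lambda(j) = beta * lambda(j - 1)
        improved = False
        for k in range(n_particles):
            for d in range(dim):         # Step V: Equations (9) and (10)
                v = (inertia * velocities[k][d]
                     + c1 * rng.random() * (best_pos[k][d] - positions[k][d])
                     + c2 * rng.random() * (global_pos[d] - positions[k][d]))
                v = max(-v_max[d], min(v_max[d], v))
                lo, hi = bounds[d]
                positions[k][d] = max(lo, min(hi, positions[k][d] + v))
                velocities[k][d] = v
            fit = fitness(positions[k])
            if fit < best_fit[k]:        # Step VI: individual best
                best_fit[k], best_pos[k] = fit, positions[k][:]
            if fit < global_fit:         # Step VII: global best
                global_fit, global_pos = fit, positions[k][:]
                improved = True
        stagnant = 0 if improved else stagnant + 1
        if global_fit == 0.0 or stagnant >= patience:  # Step VIII: stopping condition
            break
    return global_pos, global_fit


# Hypothetical search ranges for (C, alpha, epsilon).
bounds = [(0.1, 100.0), (0.01, 5.0), (0.001, 1.0)]
best_params, best_rmse = pso_minimize(surrogate_rmse, bounds)
print(best_params, best_rmse)
```

To reproduce the hybrid model, `surrogate_rmse` would be replaced by a function that trains an SVR on the training set with the candidate hyper-parameters and returns the RMSE on the testing set.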

4. Results and discussion

The outcome of the simulation carried out in this research work is presented. The results of the optimization as well as the performance comparison between the present and existing models are also presented.

4.1. Convergence of the hybrid model at different swarm population

The dependence of the convergence of the developed hybrid SVR-PSO model on the number of particles exploring and exploiting the search space is shown in Figure 1. For the error convergence presented in Figure 1(a), different swarm populations have different starting points of convergence; however, they all reach a similar global solution after several iterations. This indicates a high level of robustness of the SVR-PSO model. The convergence of the penalty/regularization hyper-parameter at various swarm populations is presented in Figure 1(b). The exploitation and exploration ability of the swarm within the defined search space is well investigated. The regularization factor converges faster when 50 particles explored the search space than when 100 and 500 particles were exploiting and exploring the space. Figure 1(c) presents the convergence of the kernel parameter for different swarm sizes at various numbers of iterations. The model converges faster when the number of particles in the population was set at 50 than when it was maintained at 100 and 500. A similar convergence pattern is obtained for the variation of epsilon at different iteration stages, as presented in Figure 1(d).

Figure 1. Convergence of the developed SVR-PSO model at various iteration for different number of swarms (a) root mean square error; (b) regularization factor; (c) kernel parameter; (d) epsilon.

The optimum hyper-parameter values obtained from the PSO are presented in Table 2.

Table 2. Results of the optimization of SVR hyper-parameters using PSO.

4.2. Comparison between developed intelligent models

The performance of the three developed models is compared on the basis of R-squared at different stages of model development and presented in Figure 2. During pattern acquisition, the developed PSO-SVR model demonstrated superior performance over the other two models, followed by the ELM-SINE and ELM-SIG models. When the acquired patterns were applied to the validation set of data, the ELM-SIG model showed superior performance, followed by the ELM-SINE and PSO-SVR models. Table 3 presents the values of each of the assessment parameters for the different stages of model development. The assessment parameters include the correlation coefficient (CC), root mean square error (RMSE), mean absolute error (MAE) and R-squared.

Figure 2. Performance comparison between developed models using R-squared parameter (a) training stage and (b) testing stage.

Table 3. Performance assessment parameters for the developed models.

The weights associated with the developed ELM-based models are presented in Table 4. The model weights contain the randomly generated biases, the output weights computed using the Moore–Penrose pseudo-inverse approach and the optimized input weights corresponding to the input descriptors.

Table 4. Weights associated with the developed ELM-based models.

4.3. Performance of the developed predictive models and comparison with the existing QSAR (2016) model

The performances of the developed hybrid SVR-PSO and ELM-based models are evaluated using CC, R-squared, RMSE and MAE and presented in Table 5. The values of CC, R-squared, RMSE and MAE associated with the developed SVR-PSO model are 0.9908, 0.9816, 0.1815 and 0.1715, respectively, while the computed assessment parameters for the developed ELM-SIG model are 0.9781, 0.9566, 0.1978 and 0.1421, respectively. For the developed ELM-SINE model, the values of CC, R-squared, RMSE and MAE obtained are 0.9866, 0.9735, 0.1555 and 0.1137, respectively. Each of the developed models has higher values of CC and R-squared and lower errors (RMSE and MAE) than the existing model. Comparing the developed SVR-PSO model with the existing QSAR (2016) model (Buglak et al., Citation2016), performance improvements of 34.78%, 17.64%, 3.65% and 7.44% were obtained on the basis of the RMSE, MAE, CC and R-squared performance metrics, respectively.
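The assessment parameters above can be computed as in the following sketch. The RMSE, MAE and Pearson correlation coefficient follow their standard definitions. The percentage-improvement formula is an assumption, since the text does not state how the improvements were calculated: it is taken here as the relative change with respect to the existing model. The function names and the numbers in the usage lines are illustrative and are not values from the tables.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))


def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def correlation_coefficient(measured, predicted):
    """Pearson correlation coefficient (CC)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    covariance = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return covariance / math.sqrt(var_m * var_p)


def percent_improvement(existing, developed, lower_is_better):
    """Assumed definition: relative change with respect to the existing model, in percent."""
    if lower_is_better:  # error metrics such as RMSE and MAE
        return 100.0 * (existing - developed) / existing
    return 100.0 * (developed - existing) / existing  # CC and R-squared


# Illustrative numbers only.
measured = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
print(rmse(measured, predicted), mae(measured, predicted),
      correlation_coefficient(measured, predicted))
print(percent_improvement(0.2, 0.1, lower_is_better=True))
```

The toy vectors show why CC alone is not sufficient: the predictions are perfectly correlated with the measurements yet carry a large error, which is why the error metrics are reported alongside it.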

Table 5. Comparison between the models and the performance improvement of SVR-PSO over QSAR (2016) (Buglak et al., Citation2016).

Comparison of the developed ELM-SIG model with the existing QSAR (2016) model (Buglak et al., Citation2016) shows the superiority of the developed ELM-SIG model, with enhancements of 31.78%, 28.93%, 2.32% and 4.70% in MAE, RMSE, CC and R-squared, respectively. Similarly, the developed ELM-SINE model outperforms the existing QSAR (2016) model (Buglak et al., Citation2016) with improvements of 45.43%, 44.12%, 3.22% and 6.54% in MAE, RMSE, CC and R-squared, respectively. The cross-plot correlation between the estimated and measured quantum yield of singlet oxygen production is presented in Figure 3 for the present and existing models.

Figure 3. Correlation cross-plot between the measured and estimated quantum yield of singlet oxygen production.

The data points for the investigated pteridine photosensitizers estimated by the developed SVR-PSO, ELM-SIG and ELM-SINE models are well aligned with the measured values, while the deviation in the estimates of the QSAR (2016) model (Buglak et al., 2016) can be inferred from its non-aligned data points.
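The percentage improvements quoted in this section can be checked for internal consistency with the short sketch below. It assumes that improvement is defined relative to the existing model, as (baseline - model) / baseline for error metrics and (model - baseline) / baseline for CC and R-squared. That definition is an assumption, not something stated in the text. The helper name `implied_baseline` is hypothetical, and the baseline it returns is back-calculated rather than a reported value.

```python
def implied_baseline(model_value, improvement_pct, higher_is_better):
    """Back-calculate the baseline metric implied by a reported improvement.

    Assumed definitions (not confirmed by the source):
      error metrics:   improvement = (baseline - model) / baseline * 100
      CC, R-squared:   improvement = (model - baseline) / baseline * 100
    """
    fraction = improvement_pct / 100.0
    if higher_is_better:
        return model_value / (1.0 + fraction)
    return model_value / (1.0 - fraction)


# (RMSE, reported RMSE improvement in %) as quoted in this section.
reported_rmse = {
    "SVR-PSO": (0.1815, 34.78),
    "ELM-SIG": (0.1978, 28.93),
    "ELM-SINE": (0.1555, 44.12),
}

baselines = {
    name: implied_baseline(value, pct, higher_is_better=False)
    for name, (value, pct) in reported_rmse.items()
}
for name, baseline in baselines.items():
    print(f"{name}: implied QSAR (2016) RMSE = {baseline:.4f}")

# If the assumed definition holds, all three models imply the same baseline RMSE.
spread = max(baselines.values()) - min(baselines.values())
print(f"spread between implied baselines = {spread:.5f}")
```

Under this assumed definition, the three reported RMSE values point to the same baseline to within rounding, which suggests the quoted percentages are mutually consistent.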

5. Conclusion

The quantum yield of singlet oxygen production for 29 pteridine photosensitizer compounds is modeled using ELM and a hybridization of SVR with the PSO algorithm. The predictors for the developed models include molecular solubility, highest occupied molecular orbital energy, dipole density, electrostatic, electronegativity and dipole, while the models were validated using the experimentally measured quantum yield of singlet oxygen production. The performance of the developed SVR-PSO model is compared with that of the existing QSAR (2016) model in the literature on the basis of RMSE, CC, MAE and MAPD, and the hybrid model shows superior performance over the existing model. Comparison of the developed ELM-SIG model with the existing QSAR (2016) model shows the superiority of the ELM-SIG model, with improvements of 31.78%, 28.93%, 2.32% and 4.70% in MAE, RMSE, CC and R-squared, respectively. Similarly, the developed ELM-SINE model outperforms the existing QSAR (2016) model, with improvements of 45.43%, 44.12%, 3.22% and 6.54% in MAE, RMSE, CC and R-squared, respectively. The accuracy and robustness of the developed predictive models should promote a quick, easy and less costly method of determining the quantum yield of singlet oxygen production for photodynamic cancer therapy. The developed models are limited to predicting the PQY of singlet oxygen production for pteridines.

Disclosure statement

The author has no competing interests.

Data availability statement

The raw data needed to reproduce these findings are cited in Section 3.1 of the manuscript.

Additional information

Notes on contributors

Nahier Aldhafferi

Dr. Nahier Aldhafferi is an associate professor at the Department of Computer Information Systems, College of Computer Science and Information Technology, Imam Abdurrahman Bin Faisal University in Saudi Arabia. He received his BS in 2005 from Dammam Teachers College, Saudi Arabia, and his master's degree in Internet Technology in 2009 from Wollongong University, Australia. He holds a PhD in Information Technology from New England University, Australia. He has published extensively in Data Science, Information Privacy and Artificial Intelligence. His current research interests are in the fields of Data Science, Data Mining, E-gov Services Integration, Privacy, Machine Learning, and Software Engineering. He can be contacted at email: [email protected]

References

Aboelkassem, Y., & Savic, D. (2021). Particle swarm optimizer for arterial blood flow models. Computer Methods and Programs in Biomedicine, 201, 105933. https://doi.org/10.1016/j.cmpb.2021.105933
Akomolafe, O., Owolabi, T. O., Rahman, M. A. A., Kechik, M. M. A., Yasin, M. N. M., & Souiyah, M. (2021). Modeling superconducting critical temperature of 122-iron-based pnictide intermetallic superconductor using a hybrid intelligent computational method. Materials (Basel, Switzerland), 14(16), 4604. https://doi.org/10.3390/ma14164604
Alqahtani, A., Saliu, S., Owolabi, T. O., Aldhafferi, N., Almurayh, A., & Oyeneyin, O. E. (2022). Modeling the magnetic cooling efficiency of spinel ferrite magnetocaloric compounds for magnetic refrigeration application using hybrid intelligent computational methods. Materials Today Communications, 33, 104310. https://doi.org/10.1016/j.mtcomm.2022.104310
Balogun, A. L., Rezaie, F., Pham, Q. B., Gigović, L., Drobnjak, S., Aina, Y. A., Panahi, M., Yekeen, S. T., & Lee, S. (2021). Spatial prediction of landslide susceptibility in western Serbia using hybrid support vector regression (SVR) with GWO, BAT and COA algorithms. Geoscience Frontiers, 12(3), 101104. https://doi.org/10.1016/j.gsf.2020.10.009
Basak, D. S. P., & Partababis, D. C. (2007). Support vector regression. Neural Information Processing, 11(10), 203–224. https://static.aminer.org/pdf/PDF/000/337/560/uncertainty_support_vector_method_for_ordinal_regression.pdf
Beheshti, Z. (2020). A time-varying mirrored S-shaped transfer function for binary particle swarm optimization. Information Sciences, 512, 1503–1542. https://doi.org/10.1016/j.ins.2019.10.029
Buglak, A. A., Telegina, T. A., & Kritsky, M. S. (2016). A quantitative structure–property relationship (QSPR) study of singlet oxygen generation by pteridines. Photochemical & Photobiological Sciences, 15(6), 801–811. https://doi.org/10.1039/c6pp00084c
Dantola, M. L., Reid, L. O., Castaño, C., Lorente, C., Oliveros, E., & Thomas, A. H. (2017). Photosensitization of peptides and proteins by pterin derivatives. Pteridines, 28(3–4), 105–114. https://doi.org/10.1515/pterid-2017-0013
Derosa, M. C., & Crutchley, R. J. (2002). Photosensitized singlet oxygen and its applications. Coordination Chemistry Reviews, 234, 351–371. https://doi.org/10.1016/S0010-8545(02)00034-6
Dodangeh, E., Panahi, M., Rezaie, F., Lee, S., Tien Bui, D., Lee, C. W., & Pradhan, B. (2020). Novel hybrid intelligence models for flood-susceptibility prediction: Meta optimization of the GMDH and SVR models with the genetic algorithm and harmony search. Journal of Hydrology, 590, 125423. https://doi.org/10.1016/j.jhydrol.2020.125423
Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A. J., & Vapnik, V. N. (1996). Support vector regression machines. Advances in Neural Information Processing Systems, 9, 155–161. https://proceedings.neurips.cc/paper_files/paper/1996/hash/d38901788c533e8286cb6400b40b386d-Abstract.html
Frimayanti, N., Yam, M. L., Lee, H. B., Othman, R., Zain, S. M., & Rahman, N. A. (2011). Validation of quantitative structure-activity relationship (QSAR) model for photosensitizer activity prediction. International Journal of Molecular Sciences, 12(12), 8626–8644. https://doi.org/10.3390/ijms12128626
Huang, G. B., Zhu, Q. Y., & Siew, C. K. (2006). Extreme learning machine: Theory and applications. Neurocomputing, 70(1–3), 489–501. https://doi.org/10.1016/j.neucom.2005.12.126
Ju, X., Liu, F., Wang, L., & Lee, W. J. (2019). Wind farm layout optimization based on support vector regression guided genetic algorithm with consideration of participation among landowners. Energy Conversion and Management, 196, 1267–1281. https://doi.org/10.1016/j.enconman.2019.06.082
Li, S., Fang, H., & Shi, B. (2021). Remaining useful life estimation of Lithium-ion battery based on interacting multiple model particle filter and support vector regression. Reliability Engineering & System Safety, 210, 107542. https://doi.org/10.1016/j.ress.2021.107542
Lin, A. L., Chen, J. H., Hong, J. W., Zhao, Y. Y., Zheng, B. Y., Ke, M. R., & Huang, J. D. (2021). A phthalocyanine-based self-assembled nanophotosensitizer for efficient in vivo photodynamic anticancer therapy. Journal of Inorganic Biochemistry, 217, 111371. https://doi.org/10.1016/j.jinorgbio.2021.111371
Mfouo-Tynga, I. S., Dias, L. D., Inada, N. M., & Kurachi, C. (2021). Photodiagnosis and photodynamic therapy features of third generation photosensitizers used in anticancer photodynamic therapy: Review. Photodiagnosis and Photodynamic Therapy, 34, 102091. https://doi.org/10.1016/j.pdpdt.2020.102091
Murillo-Escobar, J., Sepulveda-Suescun, J. P., Correa, M. A., & Orrego-Metaute, D. (2019). Forecasting concentrations of air pollutants using support vector regression improved with particle swarm optimization: Case study in Aburrá Valley, Colombia. Urban Climate, 29, 100473. https://doi.org/10.1016/j.uclim.2019.100473
Oliveira, C. S., Turchiello, R., Kowaltowski, A. J., Indig, G. L., & Baptista, M. S. (2011). Major determinants of photoinduced cell death: Subcellular localization versus photosensitization efficiency. Free Radical Biology and Medicine, 51(4), 824–833. https://doi.org/10.1016/j.freeradbiomed.2011.05.023
Olubi, O. E., Oniya, E. O., & Owolabi, T. O. (2021). Development of predictive model for radon-222 estimation in the atmosphere using stepwise regression and grid search based-random forest regression. Journal of the Nigerian Society of Physical Sciences, 3, 132–139. https://doi.org/10.46481/jnsps.2021.177
Owolabi, T. O. (2019a). Development of a particle swarm optimization based support vector regression model for titanium dioxide band gap characterization. Journal of Semiconductors, 40(2), 022803. https://doi.org/10.1088/1674-4926/40/2/022803
Owolabi, T. O. (2019b). Modeling the magnetocaloric effect of manganite using hybrid genetic and support vector regression algorithms. Physics Letters A, 383(15), 1782–1790. https://doi.org/10.1016/j.physleta.2019.02.036
Owolabi, T. O. (2023). Modeling magnetocaloric effect of doped EuTiO3 perovskite for cooling technology using swarm intelligent based support vector regression computational method. Materials Today Communications, 36, 106688. https://doi.org/10.1016/j.mtcomm.2023.106688
Owolabi, T. O., Amiruddin, M., & Rahman, A. (2021). Energy band gap modeling of doped bismuth ferrite multifunctional material using gravitational search algorithm optimized support vector regression. Crystals, 11(3), 246. https://doi.org/10.3390/cryst11030246
Oyeneyin, O. E. (2021). Predicting the bioactivity of 2-alkoxycarbonylallyl esters as potential antiproliferative agents against pancreatic cancer (MiaPaCa-2) cell lines: GFA-based QSAR and ELM-based models with molecular docking. Journal of Genetic Engineering and Biotechnology, 19, 38. https://doi.org/10.1186/s43141-021-00133-2
Rui, J., Zhang, H., Zhang, D., Han, F., & Guo, Q. (2019). Total organic carbon content prediction based on support-vector-regression machine with particle swarm optimization. Journal of Petroleum Science and Engineering, 180, 699–706. https://doi.org/10.1016/j.petrol.201.06.014
Science, N., Phenomena, C., Sabzekar, M., Mohammad, S., & Hasheminejad, H. (2021). Chaos, Solitons and Fractals. Journal of Nonlinear Science, and Nonequilibrium and Complex Phenomena, 144, 110738. https://doi.org/10.1016/j.chaos.2021.110738
Shamsah, S. M. I., & Owolabi, T. O. (2020). Modeling the maximum magnetic entropy change of doped manganite using a grid search-based extreme learning machine and hybrid gravitational search-based support vector regression. Crystals, 10(4), 310. https://doi.org/10.3390/cryst10040310
Shi, G., Monro, S., Hennigar, R., Colpitts, J., Fong, J., Kasimova, K., Yin, H., DeCoste, R., Spencer, C., Chamberlain, L., Mandel, A., Lilge, L., & McFarland, S. A. (2015). Ru(II) dyads derived from α-oligothiophenes: A new class of potent and versatile photosensitizers for PDT. Coordination Chemistry Reviews, 282–283, 127–138. https://doi.org/10.1016/j.ccr.2014.04.012
Wang, C., Liang, C., Hao, Y., Dong, Z., Zhu, Y., Li, Q., Liu, Z., Feng, L., & Chen, M. (2021). Photodynamic creation of artificial tumor microenvironments to collectively facilitate hypoxia-activated chemotherapy delivered by coagulation-targeting liposomes. Chemical Engineering Journal and the Biochemical Engineering Journal, 414, 128731. https://doi.org/10.1016/j.cej.2021.128731
Wang, Y., Li, R., & Chen, Y. (2021). Accurate elemental analysis of alloy samples with high repetition rate laser-ablation spark-induced breakdown spectroscopy coupled with particle swarm optimization-extreme learning machine. Spectrochimica Acta Part B: Atomic Spectroscopy, 177, 106077. https://doi.org/10.1016/j.sab.2021.106077
Wu, J., & Wang, Y. (2020). A working likelihood approach to support vector regression with a data-driven insensitivity parameter. International Journal of Machine Learning and Cybernetics, 14, 1–18. https://link.springer.com/article/10.1007/s13042-022-01672-x
Yildiz, A. R. (2019). A novel hybrid whale–Nelder–Mead algorithm for optimization of design and manufacturing problems. The International Journal of Advanced Manufacturing Technology, 105(12), 5091–5104. https://doi.org/10.1007/s00170-019-04532-1
Yıldız, A. R., Yıldız, B. S., Sait, S. M., Bureerat, S., & Pholdee, N. (2019). A new hybrid Harris hawks-Nelder-Mead optimization algorithm for solving design and manufacturing problems. Materials Testing, 61(8), 735–743. https://doi.org/10.3139/120.111378
Zou, L. (2021). Design of reactive power optimization control for electromechanical system based on fuzzy particle swarm optimization algorithm. Microprocessors and Microsystems, 82, 103865. https://doi.org/10.1016/j.micpro.2021.103865
