A hybridization of MODWT-SVR-DE model emphasizing on noise reduction and optimal parameter selection for prediction of CO2 emission in Thailand

Abstract

At present, Thailand is a major manufacturing base in Southeast Asia and consequently a large source of greenhouse gases (GHGs). CO₂ emission has become a prominent issue since Thailand legislated to control the industrial carbon footprint, which resulted in charges for carbon credits. To control and monitor the effects of GHGs, accurate forecasts of CO₂ emission and CO₂ emission equivalence play a significant role and can support critical decision making on suitable control and monitoring. In this research, a hybrid MODWT-SVR-DE model is developed and proposed. It employs the maximal overlap discrete wavelet transform (MODWT) with first-level decomposition to reduce fluctuation in CO₂ emission and CO₂ emission equivalence. Support vector regression is then used to formulate a complex model, while differential evolution is used to search for the parameters of support vector regression. The proposed model is compared with conventional forecasting models (i.e. ARIMA, Holt and simple exponential smoothing [SES]) and with a hybrid model of support vector regression and differential evolution. The empirical results indicate that the proposed model outperforms all candidate models and differs significantly from them at the 0.05 significance level. Consequently, the proposed model can be applied to monitor the environment and to support policy planners with a helpful guideline.

1. Introduction

Nowadays, people around the world are aware of and concerned about climate change (Franco et al., 2022; Lu et al., 2022; Kim & Jang, 2022; Sovacool et al., 2021), which is caused by greenhouse gases (GHGs) in the form of CO₂ emission and CO₂ emission equivalence (Raihan & Voumik, 2022; Batool et al., 2022; Raihan & Tuspekova, 2022; Nong et al., 2021; Yoro & Daramola, 2020). Consequently, all forms of CO₂ emission and CO₂ emission equivalence are limited and enforced by law. In Thailand (Lu et al., 2022; Wongsapai & Daroon, 2021; Srikaummun et al., 2021; Jaiboon et al., 2021), many industrial estates have been growing rapidly and produce large amounts of GHGs. However, the amount of CO₂ emission and CO₂ emission equivalence is difficult to measure in a timely manner. Moreover, the amount is uncertain in each period.

To monitor and control the risks of the greenhouse effect, the future amount of GHGs in the form of CO₂ emission and CO₂ emission equivalence needs to be known before making critical decisions on reducing those risks (Franco et al., 2022).

With the aim of extrapolating the amount of CO₂ emission and CO₂ emission equivalence in advance, many forecasting techniques have been developed and proposed in recent years. One of the most popular is univariate time series analysis, which forms the prediction function from previous observations only. It is therefore a convenient tool for predicting CO₂ emission and CO₂ emission equivalence in practice. The abbreviations and acronyms used in this research are listed in Table 1.

Table 1. List of abbreviations and acronyms used in this research.

At present, several conventional time series approaches are still widely used to predict the future amount of CO₂ emission and CO₂ emission equivalence. For instance, autoregressive integrated moving average (ARIMA) approaches (Cahyono et al., 2022; Javanmard & Ghaderi, 2022; Kumari & Singh, 2022; Xu et al., 2021; Ning et al., 2021; Leerbeck et al., 2020) perform well in linear forecasting problems, while exponential smoothing can provide robust forecasts without more complex approaches, as reported in much of the scientific literature (Franco et al., 2022; Kumari & Singh, 2022). However, an individual approach may not be sufficient to perform well in all real-world situations.

Because more accuracy is now needed, many combined approaches have been developed and proposed. A combined approach uses the advantages of one forecasting technique to offset the disadvantages of another. Combined approaches that emphasize optimal parameter selection are well established and easy to find in the literature on time series forecasting. One powerful hybridization combines support vector regression and differential evolution, namely the SVR-DE model, which uses support vector regression to build complex and adjustable functions that describe time series patterns, whether linear or nonlinear (Li et al., 2022; Ehteram et al., 2021; Ahmadi et al., 2019). The advantages of nonlinearity and complexity can provide better performance than conventional statistical techniques (e.g. exponential smoothing and ARIMA) in complex real-world problems. Since statistical techniques are based on linear formulations and many prior assumptions, they are not easily adjusted and cannot explain nonlinear relations in time series. Furthermore, support vector regression is formulated from the principle of structural risk minimization and, because its training problem is convex, provides a globally optimal solution. Although support vector regression performs well in forecasting, its performance depends heavily on appropriate parameter selection (Javanmard & Ghaderi, 2022; Adnan et al., 2022; Ehteram et al., 2021; Ghazvini et al., 2020). Consequently, differential evolution (Hou et al., 2018; Hu & Chen, 2018; Zhang et al., 2016), a recent evolutionary algorithm and effective global optimization algorithm with the advantages of simplicity, robustness, and reliability of implementation, is used to search for the most suitable parameters of support vector regression within a given search space. In addition, the combined approach of support vector regression and differential evolution can achieve more accuracy than traditional forecasting approaches. However, the combined approach may not be efficient and successful in all practical situations, particularly when the data are noisy.

In this regard, a combined approach that includes data pre-processing is introduced to address those problems. It emphasizes a preliminary process that decomposes the time series into more stationary and regular subseries. The subseries are generally clearer to analyze once the irrelevant and redundant components of the time series are removed. Although data pre-processing techniques are useful, manipulating a model for each subseries is quite difficult. Therefore, this combined approach uses data pre-processing only at the first level of decomposition to overcome the limitation of support vector regression concerning noise in time series analysis. One of several data pre-processing techniques is the maximal overlap discrete wavelet transform (MODWT), a well-known modified version of the discrete wavelet transform that can address the circular shift effect (Yang et al., 2022; Shams et al., 2021; Seo et al., 2017).

In this research, a hybrid MODWT-SVR-DE model is developed and proposed to forecast the amount of CO₂ emission from several sectors. To cover both perspectives, data pre-processing and optimal parameter selection techniques are combined to achieve more accurate and precise forecasts. The forecasting performance of the proposed model is evaluated with five accuracy measures and compared with conventional models, namely ARIMA, Holt’s model and simple exponential smoothing (SES). Moreover, the proposed model is compared with the hybrid SVR-DE model using the same five accuracy measures.

The accuracy measures include both scale-dependent and scale-independent metrics: mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), symmetric mean absolute percentage error (sMAPE) and root mean square percentage error (RMSPE). These measures are still widely used in the literature. Furthermore, the scale-independent metrics (i.e. MAPE, sMAPE and RMSPE) are used with the Friedman test and a post hoc test to identify significant differences between forecasting performances.

2. Material and methodologies

2.1. Datasets of CO₂ emission and CO₂ emission equivalence

The datasets of CO₂ emission and CO₂ emission equivalence are categorized into several sectors and are secondary data from the open online database of Climate Watch (2022), Washington, D.C.: World Resources Institute, based on the Climate Analysis Indicators Tool (CAIT) data source. They are time series from 1990 to 2019. The datasets are summarized in Figure 1.

Figure 1. Summary of CO₂ emission and CO₂ emission equivalence datasets.

2.2. Simple exponential smoothing

The simplest form of exponential smoothing is SES, which is a weighted average of previous observations with weights that decrease exponentially from the most recent observation backward. SES is suitable for time series with no clear trend or seasonal pattern. The mathematical formulation is given in Equations (1) and (2):

$l_{t - 1} = α y_{t - 1} + (1 - α) l_{t - 2}$ (1)

$y_{t} = l_{t - 1} + ε_{t}$ (2)

where $y_{t}$ and $ε_{t}$ are the actual observation and the error at time t, respectively, $l_{t}$ is the level or smoothed value of the time series at time t, and $α$ is the smoothing parameter between 0 and 1.

To establish an appropriate SES model, the model is fitted with the ses function of the R programming language (Hyndman et al., 2022), using the lowest mean square error as the criterion to obtain the smoothing parameter and the initial level of the time series.
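As a rough illustration of Equations (1) and (2), the SES recursion can be sketched in a few lines of Python. The function names `ses_forecasts` and `mean_square_error`, the initial level `l0`, and the sample numbers are assumptions made for this sketch; it is not the R routine used in the study.

```python
def ses_forecasts(y, alpha, l0):
    """One-step-ahead SES forecasts.

    forecasts[t] is the forecast of y[t], i.e. the level built from all
    observations before time t (Equation (2) without the error term).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    level = l0  # hypothetical initial level supplied by the caller
    forecasts = []
    for obs in y:
        forecasts.append(level)
        # Equation (1): blend the newest observation with the old level.
        level = alpha * obs + (1.0 - alpha) * level
    return forecasts, level  # the final level is the next-period forecast


def mean_square_error(y, forecasts):
    """Mean of squared one-step-ahead errors, the fitting criterion."""
    return sum((a - f) ** 2 for a, f in zip(y, forecasts)) / len(y)
```

For example, `ses_forecasts([10.0, 12.0, 14.0], alpha=0.5, l0=10.0)` returns the in-sample forecasts together with the forecast for the next period; a search over `alpha` that minimizes `mean_square_error` would mimic the fitting criterion described above.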

2.3. Holt’s model

To handle a trend pattern with exponential smoothing, Holt’s model adds a trend equation to SES. Hence, the mathematical form can describe a dataset with trend and is given in Equations (3)–(5):

$l_{t - 1} = α y_{t - 1} + (1 - α) (l_{t - 2} + b_{t - 2})$ (3)

$b_{t - 1} = β (l_{t - 1} - l_{t - 2}) + (1 - β) b_{t - 2}$ (4)

$y_{t} = l_{t - 1} + b_{t - 1} + ε_{t}$ (5)

where $y_{t}$ and $ε_{t}$ are the actual observation and the error at time t, respectively, $l_{t}$ and $b_{t}$ are the estimates of the level and the trend of the time series at time t, respectively, and $α$ and $β$ are the smoothing parameters for level and trend, respectively, each between 0 and 1.

The holt function of the R programming language (Hyndman et al., 2022) is used to fit the model based on the lowest mean square error; it provides suitable smoothing parameters and the estimated values of level and trend.
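The recursion in Equations (3)–(5) can be sketched in the same way. The function name `holt_forecasts` and the initial values `l0` and `b0` are hypothetical and chosen for illustration only; the study fits the model in R.

```python
def holt_forecasts(y, alpha, beta, l0, b0):
    """One-step-ahead forecasts from Holt's linear trend recursion."""
    level, trend = l0, b0  # hypothetical initial level and trend
    forecasts = []
    for obs in y:
        # Equation (5) without the error term: forecast = level + trend.
        forecasts.append(level + trend)
        # Equation (3): update the level.
        new_level = alpha * obs + (1.0 - alpha) * (level + trend)
        # Equation (4): update the trend from the change in level.
        trend = beta * (new_level - level) + (1.0 - beta) * trend
        level = new_level
    return forecasts, level + trend  # second item: next-period forecast
```

With a perfectly linear series such as `[10.0, 12.0, 14.0]` and initial values `l0=8.0`, `b0=2.0`, the recursion reproduces every observation and extrapolates the same slope one step ahead.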

2.4. Autoregressive integrated moving average

ARIMA is a well-known approach for linear forecasting problems. It generalizes the autoregressive moving average (ARMA) model with a differencing term that transforms a non-stationary time series into a stationary one so that the model parameters can be estimated. The model with mean $μ$ is described in Equation (6):

$(1 - \sum_{i = 1}^{p} ϕ_{i} B^{i}) {(1 - B)}^{d} (y_{t} - μ) = (1 - \sum_{j = 1}^{q} θ_{j} B^{j}) ε_{t}$ (6)

where $y_{t}$ and $ε_{t}$ are the actual observation and the error at time t, respectively, $B$ is the backshift operator ($B y_{t} = y_{t - 1}$), $ϕ_{i}$ and $θ_{j}$ are the autoregressive and moving average coefficients, $p$ and $q$ are the orders of the autoregressive and moving average parts, respectively, and $d$ is the integer order of differencing.

Since several ARIMA models can fit a time series, the most suitable ARIMA model is chosen by the lowest Akaike information criterion corrected for small sample sizes. The fitted model is obtained from the auto.arima function of the R programming language (Hyndman & Khandakar, 2008).
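Two ingredients of this subsection lend themselves to a short sketch: the differencing operator ${(1 - B)}^{d}$ and the small-sample correction of the Akaike information criterion. The helper names `difference` and `aicc` are assumptions, and the formula shown is the standard corrected criterion with `k` estimated parameters; how auto.arima counts parameters internally is not reproduced here.

```python
def difference(y, d=1):
    """Apply (1 - B)^d: difference the series d times."""
    out = list(y)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def aicc(log_likelihood, k, n):
    """Corrected AIC for k estimated parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("n must exceed k + 1 for the correction term")
    aic = 2.0 * k - 2.0 * log_likelihood
    # The correction grows as k approaches n, penalising large models
    # more heavily on short series.
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)
```

On a short annual series, the correction term is not negligible, which is one reason a corrected criterion is a sensible choice for model selection here.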

2.5. Support vector regression

Support vector regression extends the support vector machine to regression problems. Its mathematical expression is presented in Equation (7):

$f (x) = \sum_{i = 1}^{T} (α_{i} - α_{i}^{*}) K (x, x_{i}) + b$ (7)

where $α_{i}$ and $α_{i}^{*}$ are the so-called Lagrange multipliers, $(\cdot, \cdot)$ denotes the vector inner product, $b$ is a scalar threshold, and $K (x, x_{i})$ is the kernel function.

In this research, the kernel functions satisfying Mercer’s condition are defined in Equations (8) and (9):

Linear: $K (x, x_{i}) = x^{T} x_{i}$ (8)

Radial basis: $K (x, x_{i}) = \exp (- γ {‖ x - x_{i} ‖}^{2})$ (9)
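To make Equations (7)–(9) concrete, the sketch below evaluates both kernels and the resulting prediction function in plain Python. The function names and the argument `dual_coefs` (holding the differences $α_{i} - α_{i}^{*}$) are assumptions for illustration; training the multipliers is outside the scope of this sketch.

```python
import math


def linear_kernel(x, xi):
    """Equation (8): inner product of the two vectors."""
    return sum(a * b for a, b in zip(x, xi))


def rbf_kernel(x, xi, gamma):
    """Equation (9): exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, b, kernel):
    """Equation (7): weighted kernel sum plus the threshold b.

    dual_coefs[i] stands for (alpha_i - alpha_i^*) of support vector i.
    """
    total = sum(c * kernel(x, sv) for c, sv in zip(dual_coefs, support_vectors))
    return total + b
```

To use the radial basis kernel with `svr_predict`, the width can be bound first, for example `lambda u, v: rbf_kernel(u, v, gamma=0.5)`.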

2.6. Combined approach of support vector regression and differential evolution

The combined approach selects proper parameters of support vector regression, namely the input lag and the hyperplane parameters, by using differential evolution to search for appropriate parameters within a given search space. In other words, the combined approach reduces the risk of using improper parameters of support vector regression and can reduce the computational time needed to build the prediction function. The algorithm of the SVR-DE model is as follows:

2.6.1. The differential evolution generates initial parameters of support vector regression, covering the input lag and the hyperplane parameters, as described in Equation (10):

$θ_{i, G} = [θ_{1, i, G}, θ_{2, i, G}, θ_{3, i, G}, \dots, θ_{d, i, G}], \quad i = 1, 2, 3, \dots, N$ (10)

where $θ$ is a vector of parameters of support vector regression, $d$ is the dimension of the parameter vector, $N$ is the population size and $G$ is the generation number.

2.6.2. The support vector regression uses the given input lag, obtained from differential evolution, to build the prediction function by rearranging the historical observations into m columns as follows.

$[\begin{matrix} y_{1} & y_{2} & y_{3} & \dots & y_{m} \\ y_{2} & y_{3} & y_{4} & \dots & y_{m + 1} \\ y_{3} & y_{4} & y_{5} & \dots & y_{m + 2} \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ y_{t - m + 1} & \dots & y_{t - 2} & y_{t - 1} & y_{t} \end{matrix}]$

The first m − 1 columns of the historical observations are used as the input lags, while the last column is used as the target data.

The prediction function of support vector regression is formulated by using the hyperplane parameters in the given vector of parameters, as in Equation (11):

${\hat{y}}_{t} = f (y_{t - 1}, y_{t - 2}, y_{t - 3}, \dots, y_{t - m + 1} | θ)$ (11)

where $f$ is the prediction function of support vector regression, and ${\hat{y}}_{t}$ is the predicted value at time t.

2.6.3. Forecast one step ahead to obtain the future amount of CO₂ emission or CO₂ emission equivalence, as described in Equation (12):

$y_{t + 1} = {\hat{y}}_{t + 1} + ε_{t + 1}$ (12)

where $y_{t + 1},$ ${\hat{y}}_{t + 1}$ and $ε_{t + 1}$ are the actual observation, the forecast value and the error at time t + 1, respectively.

2.6.4. Compare the forecast value with the actual value in the test data in order to compute the MAPE.

2.6.5. An initial mutant vector of parameters $v_{i, G}$ is produced by randomly choosing three members of the population, $θ_{r_{0}, G},$ $θ_{r_{1}, G},$ and $θ_{r_{2}, G},$ as in Equation (13):

$v_{i, G} = θ_{i, G} + (θ_{best, G} - θ_{i, G}) + θ_{r_{0}, G} + 0.8 \times (θ_{r_{1}, G} - θ_{r_{2}, G})$ (13)

where $θ_{i, G}$ and $θ_{best, G}$ are the ith member of the population at the current generation and the member with the best fitness, respectively, $r_{0},$ $r_{1}$ and $r_{2}$ are randomly selected indices within the population size, G is the generation number, and $i = 1, 2, 3, \dots, N .$

2.6.6. The crossover operation produces a trial vector $u_{i, G}$ with components $u_{j, i, G}$ as in Equation (14):

$u_{j, i, G} = \begin{cases} v_{j, i, G} & \text{if } rand_{j, i} \leq 0.5 \text{ or } j = I_{rand} \\ θ_{j, i, G} & \text{otherwise} \end{cases}$ (14)

where G is the generation number, $i = 1, 2, 3, \dots, N;$ $j = 1, 2, 3, \dots, d,$ $rand_{j, i} \sim U [0, 1],$ and $I_{rand}$ is a random integer from $[1, 2, \dots, d] .$

2.6.7. The differential evolution employs a greedy selection operator as in Equation (15):

$θ_{i, G + 1} = \begin{cases} u_{i, G} & \text{if } f (u_{i, G}) \leq f (θ_{i, G}) \\ θ_{i, G} & \text{otherwise} \end{cases}$ (15)

where $f (θ_{i, G})$ is the MAPE of the target vector, $f (u_{i, G})$ is the MAPE of the trial vector, G is the generation number and $i = 1, 2, 3, \dots, N .$

2.6.8. Repeat 2.6.2–2.6.7 until the criterion is met.

Finally, the combined approach of support vector regression and differential evolution returns the optimal parameters of support vector regression.
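The search loop in steps 2.6.5–2.6.8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mutation uses the classic rand/1 form $θ_{r_{0}} + 0.8 \times (θ_{r_{1}} - θ_{r_{2}})$ rather than the exact expression of Equation (13), a fixed number of generations stands in for the stopping criterion, mutants are clipped to the search space, and `toy_fitness` is a stand-in objective that replaces the MAPE of a fitted support vector regression. All function and parameter names are hypothetical.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, generations=100,
                           weight=0.8, crossover_rate=0.5, seed=0):
    """Minimise `fitness` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initial population drawn uniformly inside the search space.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(member) for member in pop]

    for _ in range(generations):
        next_pop, next_scores = [], []
        for i in range(pop_size):
            # Mutation: three distinct members, none equal to the target i.
            others = [k for k in range(pop_size) if k != i]
            r0, r1, r2 = rng.sample(others, 3)
            mutant = [pop[r0][j] + weight * (pop[r1][j] - pop[r2][j])
                      for j in range(dim)]
            # Keep the mutant inside the search space (an added safeguard).
            mutant = [min(max(v, lo), hi) for v, (lo, hi) in zip(mutant, bounds)]

            # Binomial crossover; index j_rand always comes from the mutant.
            j_rand = rng.randrange(dim)
            trial = [mutant[j] if (rng.random() <= crossover_rate or j == j_rand)
                     else pop[i][j] for j in range(dim)]

            # Greedy selection: keep the trial only if it is no worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                next_pop.append(trial)
                next_scores.append(trial_score)
            else:
                next_pop.append(pop[i])
                next_scores.append(scores[i])
        pop, scores = next_pop, next_scores

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_fitness(theta):
    """Stand-in objective with its minimum at (1, 1, ..., 1)."""
    return sum((value - 1.0) ** 2 for value in theta)
```

In the setting of this section, `fitness` would fit a support vector regression with the candidate parameters and return the MAPE from step 2.6.4; an integer parameter such as the input lag would also need rounding before use, which the sketch leaves out.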

2.7. Combined approach of proposed model

The proposed model is a combined approach that emphasizes both noise reduction and optimal parameter selection. It uses MODWT to reduce noise in the time series through a first-level decomposition into scaling and wavelet coefficient series. Afterward, support vector regression formulates complex functions that describe the patterns of the scaling coefficients and the wavelet coefficients. In the meantime, the differential evolution algorithm searches for appropriate parameters of support vector regression. The algorithm of the proposed model is as follows:

2.7.1. The differential evolution creates initial parameters of support vector regression, which are the input lag of the scaling coefficients, the input lag of the wavelet coefficients, the hyperplane parameters of the scaling coefficients, and the hyperplane parameters of the wavelet coefficients, as in Equation (16):

$θ_{i, G} = [θ_{1, i, G}, θ_{2, i, G}, θ_{3, i, G}, \dots, θ_{d, i, G}], \quad i = 1, 2, 3, \dots, N$ (16)

where $θ$ is a vector of parameters of support vector regression, $d$ is the dimension of the parameter vector, $N$ is the population size and $G$ is the generation number.

2.7.2. The original time series is decomposed into scaling and wavelet coefficient series by using Equations (17)–(19):

${\tilde{V}}_{j, t} = \sum_{l = 0}^{L - 1} {\tilde{g}}_{l} {\tilde{V}}_{j - 1, t - 2^{j - 1} l \bmod N}, \quad t = 0, 1, 2, 3, \dots, N - 1$ (17)

${\tilde{W}}_{j, t} = \sum_{l = 0}^{L - 1} {\tilde{h}}_{l} {\tilde{V}}_{j - 1, t - 2^{j - 1} l \bmod N}, \quad t = 0, 1, 2, 3, \dots, N - 1$ (18)

$y_{t} = {\tilde{V}}_{t} + {\tilde{W}}_{t}$ (19)

where ${\tilde{V}}_{j, t}$ and ${\tilde{W}}_{j, t}$ are the $j^{th}$ level scaling coefficients and the $j^{th}$ level wavelet coefficients, respectively, and ${\tilde{h}}_{l}$ and ${\tilde{g}}_{l}$ are the wavelet and scaling filters of MODWT. The recursion starts from ${\tilde{V}}_{0, t} = X_{t},$ the original series; in Equations (17) and (18), $N$ is the length of the series, and $L$ is the length of either the wavelet filter $({\tilde{h}}_{l})$ or the scaling filter $({\tilde{g}}_{l}) .$ $y_{t}$ is the actual observation at time t.

2.7.3. The support vector regression uses the given input lag of scaling coefficients to rearrange the series into m columns. $[\begin{matrix} {\tilde{V}}_{1} & {\tilde{V}}_{2} & {\tilde{V}}_{3} & \dots & {\tilde{V}}_{m} \\ {\tilde{V}}_{2} & {\tilde{V}}_{3} & {\tilde{V}}_{4} & \dots & {\tilde{V}}_{m + 1} \\ {\tilde{V}}_{3} & {\tilde{V}}_{4} & {\tilde{V}}_{5} & \dots & {\tilde{V}}_{m + 2} \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ {\tilde{V}}_{t - m + 1} & \dots & {\tilde{V}}_{t - 2} & {\tilde{V}}_{t - 1} & {\tilde{V}}_{t} \end{matrix}]$

2.7.4. The support vector regression adopts the given input lag of wavelet coefficients in order to rearrange into n columns. $[\begin{matrix} {\tilde{W}}_{1} & {\tilde{W}}_{2} & {\tilde{W}}_{3} & \dots & {\tilde{W}}_{n} \\ {\tilde{W}}_{2} & {\tilde{W}}_{3} & {\tilde{W}}_{4} & \dots & {\tilde{W}}_{n + 1} \\ {\tilde{W}}_{3} & {\tilde{W}}_{4} & {\tilde{W}}_{5} & \dots & {\tilde{W}}_{n + 2} \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ {\tilde{W}}_{t - n + 1} & \dots & {\tilde{W}}_{t - 2} & {\tilde{W}}_{t - 1} & {\tilde{W}}_{t} \end{matrix}]$

2.7.5. The support vector regression uses the given hyperplane parameters to model a complex function for the scaling coefficient series, as presented in Equation (20):

${\hat{\tilde{V}}}_{t} = f ({\tilde{V}}_{t - 1}, {\tilde{V}}_{t - 2}, {\tilde{V}}_{t - 3}, \dots, {\tilde{V}}_{t - m + 1} | θ)$ (20)

where $f$ is the prediction function of support vector regression for the scaling coefficient series.

2.7.6. The given hyperplane parameters are used to formulate a complex function of support vector regression for the wavelet coefficient series, as described in Equation (21):

${\hat{\tilde{W}}}_{t} = f ({\tilde{W}}_{t - 1}, {\tilde{W}}_{t - 2}, {\tilde{W}}_{t - 3}, \dots, {\tilde{W}}_{t - n + 1} | θ)$ (21)

where $f$ is the prediction function of support vector regression for the wavelet coefficient series.

2.7.7. Aggregate the predicted scaling coefficient and the predicted wavelet coefficient to obtain the future amount of CO₂ emission or CO₂ emission equivalence, as presented in Equation (22):

$y_{t + 1} = {\hat{\tilde{V}}}_{t + 1} + {\hat{\tilde{W}}}_{t + 1} + ε_{t + 1}$ (22)

where $y_{t + 1},$ ${\hat{\tilde{V}}}_{t + 1},$ ${\hat{\tilde{W}}}_{t + 1}$ and $ε_{t + 1}$ are the actual observation, the predicted scaling coefficient, the predicted wavelet coefficient and the error at time t + 1, respectively.

2.7.8. An initial mutant vector of parameters $v_{i, G}$ is generated by randomly selecting three members of the population, $θ_{r_{0}, G},$ $θ_{r_{1}, G},$ and $θ_{r_{2}, G},$ as in Equation (23):

$v_{i, G} = θ_{i, G} + (θ_{best, G} - θ_{i, G}) + θ_{r_{0}, G} + 0.8 \times (θ_{r_{1}, G} - θ_{r_{2}, G})$ (23)

where $θ_{i, G}$ and $θ_{best, G}$ are the ith member of the population at the current generation and the member with the best fitness, respectively, $r_{0},$ $r_{1}$ and $r_{2}$ are randomly selected indices within the population size, G is the generation number, and $i = 1, 2, 3, \dots, N .$

2.7.9. The crossover operation creates a trial vector $u_{i, G}$ with components $u_{j, i, G}$ as in Equation (24):

$u_{j, i, G} = \begin{cases} v_{j, i, G} & \text{if } rand_{j, i} \leq 0.5 \text{ or } j = I_{rand} \\ θ_{j, i, G} & \text{otherwise} \end{cases}$ (24)

where G is the generation number, $i = 1, 2, 3, \dots, N;$ $j = 1, 2, 3, \dots, d,$ $rand_{j, i} \sim U [0, 1],$ and $I_{rand}$ is a random integer from $[1, 2, \dots, d] .$

2.7.10. A greedy selection operator is used as in Equation (25):

$θ_{i, G + 1} = \begin{cases} u_{i, G} & \text{if } f (u_{i, G}) \leq f (θ_{i, G}) \\ θ_{i, G} & \text{otherwise} \end{cases}$ (25)

where $f (θ_{i, G})$ is the MAPE of the target vector, $f (u_{i, G})$ is the MAPE of the trial vector, G is the generation number and $i = 1, 2, 3, \dots, N .$

2.7.11. Repeat 2.7.2–2.7.10 until the criterion is met.

Finally, the proposed model returns optimal parameters of support vector regression dealing with both scaling and wavelet coefficients.
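The data-preparation steps 2.7.2–2.7.4 can be illustrated with a small sketch. It assumes the Haar filter (scaling filter 1/2, 1/2 and wavelet filter 1/2, −1/2), which the article does not specify; with that filter the first-level scaling and wavelet coefficients add back to the original series, as in Equation (19). The helper names `haar_modwt_level1`, `lag_matrix` and `split_inputs_target` are hypothetical.

```python
def haar_modwt_level1(y):
    """First-level MODWT coefficients with the Haar filter.

    Equations (17) and (18) with j = 1, L = 2 and circular indexing.
    """
    n = len(y)
    scaling = [0.5 * (y[t] + y[(t - 1) % n]) for t in range(n)]
    wavelet = [0.5 * (y[t] - y[(t - 1) % n]) for t in range(n)]
    return scaling, wavelet


def lag_matrix(series, m):
    """Rearrange a series into rows of m consecutive values."""
    return [list(series[i:i + m]) for i in range(len(series) - m + 1)]


def split_inputs_target(rows):
    """First m - 1 columns are input lags; the last column is the target."""
    inputs = [row[:-1] for row in rows]
    targets = [row[-1] for row in rows]
    return inputs, targets
```

Each coefficient series would be passed through `lag_matrix` with its own number of columns (m for scaling, n for wavelet) before a separate support vector regression is fitted to it. Other wavelet filters produce different coefficients, so the exact additivity shown here should be read as a property of this Haar sketch.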

3. Cross validation

In order to evaluate the forecasting performance of the models, each dataset of CO₂ emission or CO₂ emission equivalence is divided into two parts: 70% of the data for training and 30% for testing. The training data are used to formulate the prediction function and to forecast the one-step-ahead value, which is compared with the first observation of the test data to calculate the error. Once the first observation of the test data is known, it is added to the current training data, the model is refitted, and the next one-step-ahead value is predicted and compared with the next observation of the test data. This process continues until the last observation of the test data. The five accuracy measures are defined in Equations (26)–(30):

$MAE = \frac{\sum_{t = 1}^{n} | y_{t} - {\hat{y}}_{t} |}{n}$ (26)

$RMSE = \sqrt{\frac{\sum_{t = 1}^{n} {(y_{t} - {\hat{y}}_{t})}^{2}}{n}}$ (27)

$RMSPE = \sqrt{\frac{\sum_{t = 1}^{n} {(\frac{y_{t} - {\hat{y}}_{t}}{y_{t}} \times 100)}^{2}}{n}}$ (28)

$MAPE = \frac{\sum_{t = 1}^{n} | y_{t} - {\hat{y}}_{t} | / y_{t}}{n} \times 100$ (29)

$sMAPE = \frac{\sum_{t = 1}^{n} 2 \times | y_{t} - {\hat{y}}_{t} | / (y_{t} + {\hat{y}}_{t})}{n} \times 100$ (30)

where $y_{t}$ and ${\hat{y}}_{t}$ are the actual observation and the forecast value at time t, respectively, and $n$ is the number of forecasts evaluated.
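The evaluation scheme and Equations (26)–(30) can be sketched as below. The names `accuracy_measures`, `rolling_one_step` and `naive_forecaster` are assumptions; the naive forecaster (repeat the last value) is only a placeholder for any of the models in Section 2, and the formulas assume positive observations, as is the case for emission data.

```python
import math


def accuracy_measures(actual, forecast):
    """Return MAE, RMSE, RMSPE, MAPE and sMAPE as a dictionary."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    pairs = list(zip(errors, actual, forecast))
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "RMSPE": math.sqrt(sum((e / a * 100.0) ** 2 for e, a, _ in pairs) / n),
        "MAPE": sum(abs(e) / a for e, a, _ in pairs) / n * 100.0,
        "sMAPE": sum(2.0 * abs(e) / (a + f) for e, a, f in pairs) / n * 100.0,
    }


def rolling_one_step(series, forecaster, train_frac=0.7):
    """Expanding-window evaluation with one-step-ahead forecasts.

    The forecaster sees all observations before the one it predicts, so
    each known test value joins the training data for the next step.
    """
    n_train = round(train_frac * len(series))
    forecasts = [forecaster(series[:end]) for end in range(n_train, len(series))]
    return list(series[n_train:]), forecasts


def naive_forecaster(history):
    """Placeholder model: the next value equals the last observed value."""
    return history[-1]
```

A typical call is `actual, predicted = rolling_one_step(series, naive_forecaster)` followed by `accuracy_measures(actual, predicted)`. With 30 annual observations, this split gives 21 training points and 9 one-step-ahead forecasts.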

4. Results and discussion

In the experiments of this research, which use actual data of CO₂ emission or CO₂ emission equivalence, the empirical results of the five accuracy measures are used to indicate which forecasting models perform better. They are summarized in Table 2.

Table 2. The summary of five accuracy measures based on CO₂ emission or CO₂ emission equivalence by sector.

According to the results in Table 2, each conventional model performs well in some situations compared with the others. Furthermore, the MAPE is lower than 10%, which indicates high accuracy and supports reliable forecasts. However, it is difficult to conclude that one conventional model outperforms the others, since each has its own advantages and disadvantages. The combined approach of support vector regression and differential evolution outperforms the conventional models on almost all accuracy measures. Although the SVR-DE model performs better than the conventional models, the proposed model outperforms all forecasting models on almost all accuracy measures.

To test for significant differences in forecasting performance, the Friedman test, a nonparametric analysis of variance, is conducted to identify whether at least one model differs significantly from the others. Afterward, the post hoc pairwise multiple comparison of the Conover test is conducted to identify significant differences between forecasting performances. The results of the Friedman test are summarized in Figures 2–4.

Figure 2. The summary of Friedman test based on MAPE.

Figure 3. The summary of Friedman test based on sMAPE.

Figure 4. The summary of Friedman test based on RMSPE.

Since the p value of the Friedman test based on MAPE is less than 0.05, it is sufficient to conclude that at least one model is significantly different from the others at the 0.05 significance level. Moreover, Conover’s test reveals that the proposed model is significantly different from all other forecasting models at the 0.05 significance level. Furthermore, the SVR-DE model is significantly different from all conventional models at the 0.05 significance level, although it cannot outperform the proposed model.

For the Friedman test based on sMAPE, the p value is also less than 0.05, which demonstrates that at least one performance differs from the others at the 0.05 significance level. In addition, the proposed model is still significantly different from all other forecasting models at the 0.05 significance level. In the meantime, the SVR-DE model is still significantly different from all traditional models at the 0.05 significance level.

The results of the Friedman test in Figure 4 indicate that at least one performance differs from the others at the 0.05 significance level, because the p value is less than 0.05. Based on Conover’s test, the empirical results reveal that the proposed model is still significantly better than all other forecasting models at the 0.05 significance level. In contrast, the SVR-DE model is significantly better only than the traditional models at the 0.05 significance level.
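For readers unfamiliar with the Friedman test used above, the statistic can be sketched as follows. Each row of the input is one block (for example one sector) and each column one forecasting model; values are ranked within a row and the rank sums are compared. The function names are hypothetical, the formula is the standard chi-square form without a tie correction, and any numbers passed in for illustration are not results from this study.

```python
def rank_row(values):
    """Rank values within one block (1 = smallest); ties share the average."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square statistic for n blocks (rows) and k models."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(rank_row(row)):
            rank_sums[j] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return q, rank_sums
```

The statistic is compared with a chi-square distribution with k − 1 degrees of freedom; for three models the 0.05 critical value is about 5.991. With few blocks this approximation is rough, so the sketch is meant to show the mechanics rather than to replace a statistical package.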

5. Conclusion

According to the evaluation results, the proposed model, which emphasizes both noise reduction and optimal parameter selection, is significantly superior to all other forecasting models. This evidence supports the conclusion that the noise reduction technique can remove the irrelevant and redundant components of a time series and thereby provide a clearer and more stationary series than the original. Moreover, the differential evolution algorithm can search for the most suitable parameters of support vector regression within a given search space. Meanwhile, the SVR-DE model significantly outperforms all conventional models. In other words, these combined approaches can improve accuracy and provide more reliable forecasts to support decision making on problems of CO₂ emission or CO₂ emission equivalence.

Consequently, the proposed model can be a useful tool for extrapolating the future amount of CO₂ emission or CO₂ emission equivalence and for supporting policy planners with a helpful guideline. However, the proposed model provides only one-year-ahead forecasts, which may not be suitable for medium- or long-term planning.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

This research was supported by Research and Graduate Studies, Khon Kaen University.

References

Adnan, R. M., Dai, H. L., Mostafa, R. R., Parmar, K. S., Heddam, S., & Kisi, O. (2022). Modeling multistep ahead dissolved oxygen concentration using improved support vector machines by a hybrid metaheuristic algorithm. Sustainability, 14(6), 3470. https://doi.org/10.3390/su14063470
Ahmadi, M. H., Dehghani Madvar, M., Sadeghzadeh, M., Rezaei, M. H., Herrera, M., & Shamshirband, S. (2019). Current status investigation and predicting carbon dioxide emission in Latin American countries by connectionist models. Energies, 12(10), 1916. https://doi.org/10.3390/en12101916
Batool, S., Iqbal, J., Ali, A., & Perveen, B. (2022). Causal relationship between energy consumption, economic growth, and financial development: Evidence from South Asian Countries. Journal of Environmental Science and Economics, 1(4), 61–76. https://doi.org/10.56556/jescae.v1i4.319
Cahyono, W. E., Joy, B., Setyawati, W., & Mahdi, R. (2022). Projection of CO2 emissions in Indonesia. Materials Today: Proceedings, 63(1), S438–S444. https://www.sciencedirect.com/science/article/pii/S2214785322022258
Climate Watch. (2022). Historical Emissions. https://www.climatewatchdata.org/
Ehteram, M., Sammen, S. S., Panahi, F., & Sidek, L. M. (2021). A hybrid novel SVM model for predicting CO2 emissions using Multiobjective Seagull Optimization. Environmental Science and Pollution Research International, 28(46), 66171–66192. https://doi.org/10.1007/s11356-021-15223-4
Franco, C., Melica, G., Treville, A., Baldi, M. G., Pisoni, E., Bertoldi, P., & Thiel, C. (2022). Prediction of greenhouse gas emissions for cities and local municipalities monitoring their advances to mitigate and adapt to climate change. Sustainable Cities and Society, 86, 104114. https://doi.org/10.1016/j.scs.2022.104114
Ghazvini, M., Dehghani Madvar, M., Ahmadi, M. H., Rezaei, M. H., El Haj Assad, M., Nabipour, N., & Kumar, R. (2020). Technological assessment and modeling of energy‐related CO2 emissions for the G8 countries by using hybrid IWO algorithm based on SVM. Energy Science & Engineering, 8(4), 1285–1308. https://doi.org/10.1002/ese3.593
Hou, Y., Zhao, L., & Lu, H. (2018). Fuzzy neural network optimization and network traffic forecasting based on improved differential evolution. Future Generation Computer Systems, 81, 425–432. https://doi.org/10.1016/j.future.2017.08.041
Hu, Y. L., & Chen, L. (2018). A nonlinear hybrid wind speed forecasting model using LSTM network, hysteretic ELM and differential evolution algorithm. Energy Conversion and Management, 173, 123–142. https://doi.org/10.1016/j.enconman.2018.07.070
Hyndman, R. J., & Khandakar, Y. (2008). Automatic time series forecasting: The forecast package for R. Journal of Statistical Software, 27, 1–22.
Hyndman, R., Athanasopoulos, G., Bergmeir, C., Caceres, G., Chhay, L., O'Hara-Wild, M., Petropoulos, F., Razbash, S., Wang, E., & Yasmeen, F. (2022). forecast: Forecasting functions for time series and linear models. R package version 8.16. https://pkg.robjhyndman.com/forecast/.
Jaiboon, N., Wongsapai, W., Daroon, S., Bunchuaidee, R., Ritkrerkkrai, C., & Damrongsak, D. (2021). Greenhouse gas mitigation potential from waste heat recovery for power generation in cement industry: The case of Thailand. Energy Reports, 7, 638–643. https://doi.org/10.1016/j.egyr.2021.07.089
Javanmard, M. E., & Ghaderi, S. F. (2022). A hybrid model with applying machine learning algorithms and optimization model to forecast greenhouse gas emissions with energy market data. Sustainable Cities and Society, 82, 103886. https://doi.org/10.1016/j.scs.2022.103886
Kim, I., & Jang, Y. (2022). Material efficiency and greenhouse gas reduction effect of industrial waste by material circulation in Korea. Journal of Cleaner Production, 376, 134053. https://doi.org/10.1016/j.jclepro.2022.134053
Kumari, S., & Singh, S. K. (2022). Machine learning-based time series models for effective CO2 emission prediction in India. Environmental Science and Pollution Research, 30(55), 116601–116616. https://doi.org/10.1007/s11356-022-21723-8
Leerbeck, K., Bacher, P., Junker, R. G., Goranović, G., Corradi, O., Ebrahimy, R., Tveit, A., & Madsen, H. (2020). Short-term forecasting of CO2 emission intensity in power grids by machine learning. Applied Energy, 277, 115527. https://doi.org/10.1016/j.apenergy.2020.115527
Li, X., Ren, A., & Li, Q. (2022). Exploring patterns of transportation-related CO2 emissions using machine learning methods. Sustainability, 14(8), 4588. https://doi.org/10.3390/su14084588
Lu, L. C., Chiu, S. Y., Chiu, Y. H., & Chang, T. H. (2022). Sustainability efficiency of climate change and global disasters based on greenhouse gas emissions from the parallel production sectors–A modified dynamic parallel three-stage network DEA model. Journal of Environmental Management, 317, 115401. https://doi.org/10.1016/j.jenvman.2022.115401
Ning, L., Pei, L., & Li, F. (2021). Forecast of China’s carbon emissions based on Arima method. Discrete Dynamics in Nature and Society, 2021, 1–12. https://doi.org/10.1155/2021/1441942
Nong, D., Simshauser, P., & Nguyen, D. B. (2021). Greenhouse gas emissions vs CO2 emissions: Comparative analysis of a global carbon tax. Applied Energy, 298, 117223. https://doi.org/10.1016/j.apenergy.2021.117223
Raihan, A., & Tuspekova, A. (2022). Nexus between energy use, industrialization, forest area, and carbon dioxide emissions: New insights from Russia. Journal of Environmental Science and Economics, 1(4), 1–11. https://doi.org/10.56556/jescae.v1i4.269
Raihan, A., & Voumik, L. C. (2022). Carbon emission dynamics in India due to financial development, renewable energy utilization, technological innovation, economic growth, and urbanization. Journal of Environmental Science and Economics, 1(4), 36–50. https://doi.org/10.56556/jescae.v1i4.412
Seo, Y., Choi, Y., & Choi, J. (2017). River stage modeling by combining maximal overlap discrete wavelet transform, support vector machines and genetic algorithm. Water, 9(7), 525. https://doi.org/10.3390/w9070525
Shams, M. A., Anis, H. I., & El-Shahat, M. (2021). Denoising of heavily contaminated partial discharge signals in high-voltage cables using maximal overlap discrete wavelet transform. Energies, 14(20), 6540. https://doi.org/10.3390/en14206540
Sovacool, B. K., Griffiths, S., Kim, J., & Bazilian, M. (2021). Climate change and industrial F-gases: A critical and systematic review of developments, sociotechnical systems and policy options for reducing synthetic greenhouse gas emissions. Renewable and Sustainable Energy Reviews, 141, 110759. https://doi.org/10.1016/j.rser.2021.110759
Srikaummun, N., Wongsapai, W., Damrongsak, D., Thepsaskul, W., Ritkrekkrai, C., Bunchuaidee, R., Tridech, N., & Juprasert, P. (2021). Greenhouse gas mitigation and electricity saving potential from replacing refrigerants in Thai refrigerator. Energy Reports, 7, 98–104. https://doi.org/10.1016/j.egyr.2021.07.138
Wongsapai, W., & Daroon, S. (2021). Estimation of greenhouse gas mitigation potential from carbon intensity and energy data analysis from Thai industrial sector. Energy Reports, 7, 930–936. https://doi.org/10.1016/j.egyr.2021.07.048
Xu, H., Pan, X., Guo, S., & Lu, Y. (2021). Forecasting Chinese CO2 emission using a non-linear multi-agent intertemporal optimization model and scenario analysis. Energy, 228, 120514. https://doi.org/10.1016/j.energy.2021.120514
Yang, Z., Ren, J., Ma, S., Chen, X., Cui, S., & Xiang, L. (2022). The emission-inequality Nexus: Empirical evidence from a wavelet-based quantile-on-quantile regression approach. Frontiers in Environmental Science, 10, 871846. https://doi.org/10.3389/fenvs.2022.871846
Yoro, K. O., & Daramola, M. O. (2020). CO2 emission sources, greenhouse gases, and the global warming effect. Advances in carbon capture (pp. 3–28). Woodhead Publishing.
Zhang, F., Deb, C., Lee, S. E., Yang, J., & Shah, K. W. (2016). Time series forecasting for building energy consumption using weighted support vector regression with differential evolution optimization technique. Energy and Buildings, 126, 94–103. https://doi.org/10.1016/j.enbuild.2016.05.028
