Research Article

Short-term power load forecasting using SSA-CNN-LSTM method

Article: 2343297 | Received 18 Oct 2023, Accepted 09 Apr 2024, Published online: 07 May 2024

ABSTRACT

Short-term power load forecasting provides an essential foundation for the dispatching management of the power system and is crucial for enhancing economic efficiency and ensuring operational stability. To improve the precision of short-term power load forecasting, this paper proposes a hybrid prediction algorithm based on the sparrow search algorithm (SSA), convolutional neural network (CNN) and long short-term memory (LSTM). First, feature datasets are constructed from date information, meteorological data and similar days. The CNN performs effective feature extraction on the data and feeds the results into the LSTM for time-series analysis. Second, eight key parameters are optimized by SSA to improve the prediction precision of the CNN-LSTM model. Simulation results show that the R2 of the proposed model reaches 0.9919, a substantial enhancement over other models, while the MAPE decreases markedly to 1.2%. Furthermore, the RMSE and MAE decrease to 1.17 MW and 0.97 MW, respectively. The proposed method therefore improves prediction accuracy, owing to the data-mining strength of CNN, the good time-series fitting ability of LSTM and the excellent optimization ability of SSA.

1. Introduction

In recent years, the utilization of renewable energy has witnessed a remarkable surge. The interconnection of distributed energy resources has also increased, which has led to an augmented level of complexity and instability in the power grid. In addition, the rapid development of the smart grid intensifies these challenges, requiring more sophisticated management techniques and more accurate load forecasting technology. Generally, power load forecasting can be broadly classified into three types based on the time span and operation decision, namely short-term, medium-term and long-term forecasting. Short-term load forecasting (STLF) serves as a pivotal and indispensable element in daily operations, unit commitment (UC), scheduling functions, assessment of net interchange and analysis of system security (Srinivasan & Lee, Citation1995). STLF, spanning from hours to a few days ahead, holds immense significance as an essential element in the operation and dispatch of power systems, offering decision support for power companies. Precise short-term load forecasting results not only contribute to the secure and dependable operation of the system, but also mitigate resource inefficiencies and enhance economic efficiency (Jin et al., Citation2020). Therefore, improving short-term load forecasting accuracy has significant practical implications (Chen et al., Citation2017).

Currently, the statistical methods of power load forecasting primarily include the autoregressive integrated moving average (ARIMA) (Conejo et al., Citation2005; Nury et al., Citation2017), exponential smoothing (Mi et al., Citation2018), the Kalman filter (Tian et al., Citation2018) and linear regression (Dhaval & Deshpande, Citation2020). The ARIMA approach was employed to predict daily power consumption, followed by support vector machines (SVMs) to rectify the disparities in the initial prediction (Nie et al., Citation2012). A hybrid Kalman filter was introduced for power load forecasting, taking into account prediction interval estimates (Guan et al., Citation2017); it could be extended to other seasonal data and diverse forecasting models. In (Wang et al., Citation2017), multiple linear regression was employed for load forecasting. Although these methods are straightforward and easily executable, they impose stringent demands on the quality of the input data, which renders them incapable of adequately capturing the influence of nonlinear factors.

Recently, the swift progress of artificial intelligence has led to the extensive utilization of deep learning algorithms for power load prediction. These algorithms are known for their strong nonlinear mapping and adaptive capabilities and have consistently achieved excellent results. Typical methods include deep neural networks such as the convolutional neural network (CNN) and the recurrent neural network (RNN). CNN is a neural network that encompasses convolutional computations and possesses a deep structure; it has found wide application in fields such as image and speech processing (Hao et al., Citation2018). The CNN algorithm has been utilized to extract features from input data and capture the cyclic patterns associated with seasonal variations, enhancing the accuracy of load forecasting (Dong et al., Citation2017b; Wang et al., Citation2019). CNN models demonstrate heightened precision when applied to intricately nonlinear sequences. With its distinctive architecture comprising convolutional layers, nonlinear activation functions and pooling layers, CNN can effectively capture nonlinear features, addressing the limitation of traditional algorithms in capturing such factors.

The improved long short-term memory (LSTM) network based on RNN makes a particularly prominent contribution to the processing of long sequences. In (Lin et al., Citation2022; Zheng et al., Citation2017), LSTM-based methods show that LSTM exhibits accurate forecasting capabilities for complex electric load time series, even over long forecasting horizons. However, there are two deficiencies in the practical application of LSTM to short-term load forecasting. On one hand, it may struggle to identify potential relationships and meaningful information among non-adjacent data points (Ghimire et al., Citation2021; Rafi et al., Citation2021; Tian et al., Citation2019). On the other hand, the model parameters are usually selected by manual experience, which can result in poor universality and high uncertainty, limiting the accuracy of load forecasting (Liao et al., Citation2019; Liu et al., Citation2014). The gated mechanisms in LSTM enable its memory unit to retain and update information over a long period, accommodating inputs from multiple time steps. This empowers it to process long data sequences efficiently and capture long-term dependencies and dynamic variations within the sequence. CNN's proficiency in dealing with nonlinear data mitigates LSTM's limitation in processing non-adjacent data points.

The sparrow search algorithm (SSA) is a new swarm intelligence optimization algorithm that draws inspiration from the predatory behaviour of sparrows (Xue & Shen, Citation2020). In (An et al., Citation2021), SSA was employed to optimize the parameters of a deep extreme learning machine (DELM), effectively addressing the issue of random fluctuations in the weights and thresholds of the model; the experimental findings showed that the suggested method achieves superior performance for wind power forecasting. SSA has also been applied in the field of photovoltaic power generation prediction (Ma et al., Citation2022). Some traditional optimization algorithms, such as the genetic algorithm (GA) and particle swarm optimization (PSO), have certain drawbacks, including a tendency to get trapped in local optima, slower convergence speed and relatively low adaptability. Compared to these algorithms, SSA addresses the aforementioned shortcomings and offers strong global optimization ability, good adaptability and simplicity of implementation. Therefore, this paper chooses SSA as the preferred method to optimize the parameters of the aforementioned model, thus bolstering the model's stability.

Inspired by the aforementioned research, a hybrid prediction algorithm is proposed that combines SSA with the CNN and LSTM algorithms. The load data and other influencing-factor data are fed into the CNN-LSTM model, and SSA is employed to optimize the network parameters to enhance the precision of the load prediction model. The main contributions of this paper are outlined below.

  1. The feature set encompasses climate factors, date factors and similar daily load factors, allowing for the effective utilization of CNN networks’ strengths in the domain of data mining. This approach enables the extraction of potential relationships from non-adjacent data in a high-dimensional space.

  2. The SSA is combined with the CNN-LSTM hybrid neural network for the first time to forecast power load. To ascertain the most suitable network parameters, the SSA approach is employed, overcoming the heightened uncertainty associated with selection by manual experience. In addition, eight key parameters of the CNN-LSTM prediction model are optimized by SSA to improve the prediction precision.

This paper is structured into the subsequent sections. Section 2 constructs the feature set. Section 3 describes the SSA, CNN and LSTM methodologies and introduces the proposed SSA-CNN-LSTM power prediction model. Section 4 showcases the numerical outcomes from actual data and contrasts the performance of the proposed hybrid model with other predictive models. Section 5 concludes the paper.

2. Load influencing factors

Establishing a reasonable prediction feature set is essential to ensuring the accuracy and convergence of the model’s predictions (Lei et al., Citation2016). Within this paper, the power load is represented as follows:

(1) L(t) = L_n(t) + L_w(t) + L_s(t) + L_d(t)

In Equation (1), L(t) is the actual load value at time t, L_n(t) is the original predicted load trend at time t, L_w(t) is the load fluctuation caused by climatic factors, L_s(t) is the load fluctuation caused by date factors and L_d(t) is the load fluctuation caused by similar-day factors. From the equation, it is evident that the overall prediction accuracy can be enhanced by improving the prediction accuracy of L_w(t), L_s(t) and L_d(t). This paper takes the original load data, climate factors, date factors and similar-day factors as inputs for model training, as illustrated in Table 1.

Table 1. Load influencing factors.

3. Methods

3.1. Sparrow search algorithm

The SSA is primarily based on the predatory behaviour of sparrows (Wang & Xian, Citation2021). To simplify the analysis, the behaviour of sparrows has been idealized, and rules corresponding to this idealization have been formulated. Let n represent the number of sparrows in the population and d the dimension of the variable to be optimized; the location information of the sparrows can then be abstracted as the matrix

(2) X = [ x_{1,1}  x_{1,2}  …  x_{1,d}
          x_{2,1}  x_{2,2}  …  x_{2,d}
          …
          x_{n,1}  x_{n,2}  …  x_{n,d} ]

The fitness values of all sparrows are denoted by the following vector:

(3) F_x = [ f([x_{1,1} x_{1,2} … x_{1,d}]), f([x_{2,1} x_{2,2} … x_{2,d}]), …, f([x_{n,1} x_{n,2} … x_{n,d}]) ]^T

The producers update their location by Equation (4):

(4) X_{i,j}^{t+1} = X_{i,j}^t · exp(−i / (α · iter_max)),   if R2 < ST
    X_{i,j}^{t+1} = X_{i,j}^t + Q·L,                        if R2 ≥ ST

where X_{i,j}^{t+1} represents the value of the jth dimension of the ith sparrow at iteration t+1, α ∈ (0, 1] is a random number, Q is a random number following the normal distribution, L is a 1×d matrix whose elements are all 1, R2 ∈ [0, 1] is the alarm value and ST ∈ [0.5, 1] is the safety threshold. When R2 < ST, indicating the absence of predators, the producer switches to wide search mode. If R2 ≥ ST, all sparrows should swiftly fly to other safe regions, as this suggests that some sparrows have detected the predator.

The scroungers update their location as follows:

(5) X_{i,j}^{t+1} = Q · exp((X_worst^t − X_{i,j}^t) / i²),            if i > n/2
    X_{i,j}^{t+1} = X_P^{t+1} + |X_{i,j}^t − X_P^{t+1}| · A⁺ · L,     otherwise

where X_P^{t+1} is the optimal position held by the producer, X_worst^t represents the current global worst location, A is a 1×d matrix in which each element is randomly assigned 1 or −1, and A⁺ = A^T(AA^T)^{−1}. When i > n/2, the scrounger with the lowest fitness value, i.e. the ith individual, is the most vulnerable to starvation. Otherwise, the ith scrounger follows the producer's foraging centre and forages randomly around it.

Assuming that danger-aware sparrows constitute 10–20% of the population, their positions are updated as follows:

(6) X_{i,j}^{t+1} = X_best^t + β · |X_{i,j}^t − X_best^t|,                          if f_i > f_g
    X_{i,j}^{t+1} = X_{i,j}^t + K · ( |X_{i,j}^t − X_worst^t| / ((f_i − f_ω) + ε) ), if f_i = f_g

where X_best^t represents the current global optimal position, β is the step-size control parameter, a stochastic value following a Gaussian distribution with mean 0 and variance 1, K ∈ [−1, 1] is a random number and f_i is the current fitness value of the sparrow. The current best and worst fitness values are f_g and f_ω, respectively, and ε is a minimal constant that prevents division by zero.
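
To make the three update rules concrete, the following NumPy sketch applies Equations (4)–(6) to a population once. It is an illustrative simplification, not the authors' implementation: the producer fraction (20%) and the danger-group size follow the settings quoted later in Section 4.2, and the matrix term |X − X_P|·A⁺·L is applied element-wise.

```python
import numpy as np

def ssa_step(X, fitness, iter_max, ST=0.8, producer_frac=0.2,
             rng=np.random.default_rng(0)):
    """One iteration of the SSA position updates (Equations (4)-(6)).

    X: (n, d) array of sparrow positions; fitness: callable returning a
    scalar (lower is better). Sparrows are re-sorted so the best come first.
    """
    n, d = X.shape
    f = np.array([fitness(x) for x in X])
    X = X[np.argsort(f)]                      # best (lowest fitness) first
    n_prod = max(1, int(producer_frac * n))
    R2 = rng.random()                         # alarm value

    # Producers, Equation (4)
    for i in range(n_prod):
        if R2 < ST:
            alpha = rng.uniform(1e-8, 1.0)
            X[i] = X[i] * np.exp(-(i + 1) / (alpha * iter_max))
        else:
            X[i] = X[i] + rng.normal() * np.ones(d)          # Q * L

    # Scroungers, Equation (5)
    X_best, X_worst = X[0].copy(), X[-1].copy()
    for i in range(n_prod, n):
        if i + 1 > n / 2:
            X[i] = rng.normal() * np.exp((X_worst - X[i]) / (i + 1) ** 2)
        else:
            A = rng.choice([-1.0, 1.0], size=d)
            X[i] = X_best + np.abs(X[i] - X_best) * (A / d)  # A+ = A^T (A A^T)^-1

    # Danger-aware sparrows, Equation (6): a random subset of the population
    f = np.array([fitness(x) for x in X])
    fg, fw = f.min(), f.max()
    for i in rng.choice(n, size=max(1, n // 5), replace=False):
        if f[i] > fg:
            X[i] = X_best + rng.normal() * np.abs(X[i] - X_best)
        else:
            K = rng.uniform(-1, 1)
            X[i] = X[i] + K * (np.abs(X[i] - X_worst) / ((f[i] - fw) + 1e-12))
    return X
```

Iterating `ssa_step` while tracking the best position found gives the full optimizer; Section 3.4 shows where this loop sits in the overall model.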

3.2. CNN network model

CNN is one of the more established algorithms in the area of deep learning. Thanks to its internal weight-sharing and local-connection structures, CNN can effectively extract the in-depth features contained in the data while reducing algorithmic complexity (Dong et al., Citation2017a; Sajjad et al., Citation2020; Yan et al., Citation2020). The convolution layer of CNN convolves the data and extracts potential features, and the pooling layer down-samples the output and compresses the network parameters. Alternating between convolution layers and pooling layers allows for the efficient extraction of potential features from input data while minimizing errors caused by manual feature extraction. Therefore, this paper utilizes CNN to extract features, which are then transferred to the LSTM network for prediction. The structure of the CNN is depicted in Figure 1 and its parameters are listed in Table 2.

Figure 1. Structure diagram of CNN.


Table 2. Parameters of CNN.
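
The convolution-plus-pooling pipeline described above can be sketched without any deep-learning framework. The kernel shapes and input sizes below are illustrative assumptions, not the values from Table 2:

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1-D convolution with ReLU: x (T, C_in), kernels (K, C_in, C_out)
    -> feature map of shape (T - K + 1, C_out)."""
    K, C_in, C_out = kernels.shape
    T = x.shape[0]
    out = np.empty((T - K + 1, C_out))
    for t in range(T - K + 1):
        window = x[t:t + K]                                  # (K, C_in)
        out[t] = np.tensordot(window, kernels, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)                              # ReLU activation

def maxpool1d(x, size=2):
    """Non-overlapping max pooling along the time axis."""
    T = (x.shape[0] // size) * size
    return x[:T].reshape(-1, size, x.shape[1]).max(axis=1)

# Toy example: 24 time steps, 5 input features, 8 convolution kernels of size 3
rng = np.random.default_rng(1)
x = rng.normal(size=(24, 5))
feat = maxpool1d(conv1d(x, rng.normal(size=(3, 5, 8)), np.zeros(8)), size=2)
print(feat.shape)   # (11, 8)
```

The pooled feature sequence `feat` is what, in the full model, would be handed to the LSTM layer described next.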

3.3. Long short-term memory

The key parts of the LSTM architecture are memory units and nonlinear gate units, which maintain storage over time and help control the information flow. The three gating units are the input gate i_t, the forget gate f_t and the output gate o_t. The forget gate retains useful information, filters out useless information and controls the state of the system to prevent memory saturation. The input gate determines how much of the current input is stored in the unit state. The output gate manages the impact of long-term memory on the current output. The unit structure diagram of LSTM is shown in Figure 2.

Figure 2. Structure diagram of LSTM.


The input gate i_t and forget gate f_t are calculated as follows:

(7) i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
(8) f_t = σ(W_f · [h_{t−1}, x_t] + b_f)

where σ represents the sigmoid function, W_i and W_f represent the weight matrices, h_{t−1} denotes the output of the previous cell, x_t signifies the input data, and b_i and b_f denote the bias vectors.

The subsequent step updates the cell state C_t, computed as:

(9) C_t = f_t · C_{t−1} + i_t · tanh(W_c · [h_{t−1}, x_t] + b_c)

where W_c is the weight matrix, C_{t−1} denotes the state of the previous cell and b_c is the bias vector.

The output gate o_t and the final output h_t are given by:

(10) o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
(11) h_t = o_t · tanh(C_t)

where W_o is the weight matrix and b_o is the bias vector.
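
Equations (7)–(11) map directly onto a single forward step of an LSTM cell. The following NumPy sketch, with assumed toy dimensions, makes the gate computations explicit:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W, b):
    """Single LSTM step implementing Equations (7)-(11).

    W holds the four weight matrices (W_i, W_f, W_c, W_o), each of shape
    (H, H + D), applied to the concatenation [h_{t-1}, x_t]; b holds the
    four bias vectors.
    """
    z = np.concatenate([h_prev, x_t])
    i_t = sigmoid(W["i"] @ z + b["i"])                        # input gate, Eq. (7)
    f_t = sigmoid(W["f"] @ z + b["f"])                        # forget gate, Eq. (8)
    c_t = f_t * c_prev + i_t * np.tanh(W["c"] @ z + b["c"])   # cell state, Eq. (9)
    o_t = sigmoid(W["o"] @ z + b["o"])                        # output gate, Eq. (10)
    h_t = o_t * np.tanh(c_t)                                  # hidden output, Eq. (11)
    return h_t, c_t

# Toy step: 4 input features, 6 hidden units (illustrative sizes only)
rng = np.random.default_rng(0)
D, H = 4, 6
W = {k: rng.normal(scale=0.1, size=(H, H + D)) for k in "ifco"}
b = {k: np.zeros(H) for k in "ifco"}
h, c = lstm_cell(rng.normal(size=D), np.zeros(H), np.zeros(H), W, b)
print(h.shape)  # (6,)
```

Unrolling this cell over the pooled CNN feature sequence, one step per time index, yields the hidden states that the fully connected layer maps to the load forecast.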

3.4. SSA-CNN-LSTM model

A short-term power load forecasting model, called SSA-CNN-LSTM, is presented by combining the SSA algorithm with the CNN-LSTM network. The structure diagram is depicted in Figure 3. Initially, the load data and other influencing-factor data are fed into the CNN-LSTM neural network. The two convolution layers of the CNN then extract more representative features, and the feature sequence is fed into the pooling layer, resulting in a feature sequence with enhanced expressive power. Next, the useful features extracted by the CNN are input into the LSTM network for comprehensive processing. The LSTM layer memorizes and filters the integrated data features and models the nonlinear data patterns, after which the fully connected layer predicts the load values 24 time steps into the future. Meanwhile, SSA is employed to optimize the values of the key parameters in CNN-LSTM, so that the results approach the optimal value more rapidly and converge near it. Finally, the optimal parameter configuration is obtained and assigned to the CNN-LSTM hybrid neural network.

Figure 3. Flowchart of the SSA-CNN-LSTM algorithm.


The specific process of optimization is as follows.

  1. Set the parameters related to the SSA algorithm: number of populations, number of iterations, safety threshold, proportion of discoverers in the sparrow population, optimization dimensions, fitness function, etc;

  2. Assign initial values to the eight parameters that need optimization in the CNN-LSTM model and set the range of optimization parameters;

  3. Calculate the fitness function for each sparrow's position;

  4. Rank the sparrows based on their fitness function and select the sparrow with the highest fitness value as the current optimal solution;

  5. Compare the current optimal value with the previously saved optimal value from the previous iteration. If the fitness function value is higher this time, update the global optimal parameters;

  6. Determine if convergence or the maximum number of iterations has been reached. If not, return to step (3); if the termination condition is met, end the optimization and find the optimal values for the parameters.

The eight key parameters for SSA optimization are the learning rate, the number of iterations, the kernel size of the two convolution layers, the number of kernels of the two convolution layers, the number of neurons in the LSTM layer and the number of neurons in the full connection layer. Precisely optimizing these key parameters enhances the stability, fitting capacity and predictive accuracy of the model, thereby attaining superior performance. Selecting an appropriate learning rate lets the model converge faster during training while avoiding gradient explosion or vanishing, enhancing its stability. Adjusting the number of iterations allows the model to better adapt to the dataset and improve its fitting capability: too many iterations may lead to overfitting, while too few may result in underfitting, so it is crucial to choose a number of iterations that balances fitting ability and generalization. Modifying the size and number of the convolutional kernels lets the model extract features better, enhancing its feature-representation capability. Changing the number of neurons in the LSTM layer influences the model's modelling ability and hence its handling of sequential data. Adjusting the number of neurons in the full connection layer allows flexible control over the output dimensions and model complexity, balancing expressive power against computational complexity.
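
The optimization loop in steps (1)–(6) can be outlined as follows. This is a hedged skeleton, not the authors' code: `fitness` stands in for training the CNN-LSTM and returning a validation error (so lower is better, and a cheap surrogate replaces it here), and the population update is simplified to a move-toward-best rule rather than the full producer/scrounger dynamics of Equations (4)–(6). The search ranges match those given in Section 4.2.

```python
import numpy as np

# Search ranges for the eight hyperparameters (Section 4.2):
# [learning rate, iterations, kernel size x2, kernel count x2, LSTM units, FC units]
LOWER = np.array([0.001, 1, 1, 1, 1, 1, 1, 1])
UPPER = np.array([0.01, 100, 5, 5, 16, 16, 100, 100])

def fitness(params):
    """Placeholder for the real objective: train CNN-LSTM with `params`
    and return the validation error. A cheap quadratic surrogate is used here."""
    return float(np.sum(((params - LOWER) / (UPPER - LOWER) - 0.5) ** 2))

def ssa_optimize(pop=50, iters=100, rng=np.random.default_rng(0)):
    X = LOWER + rng.random((pop, 8)) * (UPPER - LOWER)   # step 2: init in range
    best_x, best_f = None, np.inf
    for _ in range(iters):
        f = np.array([fitness(x) for x in X])            # step 3: evaluate
        i_best = int(np.argmin(f))                       # step 4: rank and select
        if f[i_best] < best_f:                           # step 5: update global best
            best_f, best_x = f[i_best], X[i_best].copy()
        # Simplified position update standing in for Equations (4)-(6):
        # move every sparrow part-way toward the current best.
        X = X + rng.random((pop, 8)) * (X[i_best] - X)
        X = np.clip(X, LOWER, UPPER)                     # keep parameters in range
    return best_x, best_f                                # step 6: terminate

best_params, best_err = ssa_optimize()
```

In the real pipeline the integer-valued parameters (iterations, kernel sizes and counts, neuron counts) would be rounded before each training run.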

4. Simulation results

To demonstrate the effectiveness and feasibility of the proposed prediction method, the power load data of Zhejiang province from February 13, 2010 to May 20, 2010 are selected as the dataset. The average temperature, maximum temperature, minimum temperature, maximum humidity, relative humidity, week type, similar-day data and 24-hour load data (at 1-h intervals) were collected, giving a total of 3168 data points divided into 96 samples. February 13 to May 2 is used as the training set and May 3 to May 20 as the test set, and three representative dates, May 3 (holiday), May 16 (Saturday) and May 20 (working day), are chosen for comparison with the prediction results of other models. The experimental setup of this study is as follows. Operating system: Windows 11. Processor: Intel Core i9-13900HX 5.4 GHz. Graphics card: NVIDIA GeForce RTX 4060. Memory: 16 GB. The models in this paper are built in the PyCharm 2022.2 environment using the Python 3.11 programming language.

4.1. Data preprocessing

In order to expedite the convergence of gradient descent and reach the optimal solution, the power load data are normalized. The normalization employs the max-min standardization technique to scale all input attributes into the range [0, 1]. In this way, the stability of attributes with very small variance is enhanced, and zero entries in a sparse matrix are preserved. The normalization equation is:

(12) x* = (x − x_min) / (x_max − x_min)

where x* is the normalized value, and x_max and x_min are the maximum and minimum values of the load data, respectively.
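
Equation (12) is a one-liner in code; a quick check with toy load values (illustrative numbers, not from the dataset):

```python
import numpy as np

def min_max_normalize(x):
    """Equation (12): scale load values into [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

load = np.array([20.0, 25.0, 30.0, 40.0])
scaled = min_max_normalize(load)   # -> 0.0, 0.25, 0.5, 1.0
```

The same x_min and x_max from the training set must be reused to de-normalize the model's outputs back to megawatts.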

4.2. SSA optimization

In this paper, SSA optimizes eight parameters, whose optimization ranges are set as follows: learning rate (0.001 ∼ 0.01), number of iterations (1 ∼ 100), kernel size of the two convolution layers (1 ∼ 5), number of kernels of the two convolution layers (1 ∼ 16), number of neurons in the LSTM layer (1 ∼ 100) and number of neurons in the full connection layer (1 ∼ 100). The parameter settings for SSA are as follows: sparrow population number pop = 50; upper limit for the iterations M = 100; safety threshold ST = 0.8; proportion of producers in the sparrow population 20%; optimization dimension dim = 8 (Rehman et al., Citation2020).

The SSA optimization process is illustrated in Figure 4. The green curve in the figure is the iterative process curve on May 3; the blue curve is the iterative process curve on May 16; the red curve is the iterative process curve on May 20, which has converged to the ideal value. The set values of the optimization parameters are presented in Table 3.

Figure 4. Iterative process diagram of optimization parameters: (a) Iterative process figure of learning rate; (b) Iterative process figure of the number of iterations; (c) Iterative process figure of the first convolution kernel number; (d) Iterative process figure of the first convolution kernel size; (e) Iterative process figure of the second convolution kernel number; (f) Iterative process figure of the second convolution kernel size;(g) Iterative process figure of the LSTM node number; (h) Iterative process figure of the FC node number.


Table 3. Optimization parameter setting of sparrow search algorithm.

4.3. Model evaluation criteria

To check the precision of the proposed method, MAPE, RMSE, MAE and R2 are used to quantify the errors of the different models (Qin et al., Citation2021); their expressions are as follows:

(13) MAPE = (1/N) Σ_{i=1}^{N} ( |ŷ_i − y_i| / y_i ) × 100%
(14) RMSE = sqrt( (1/N) Σ_{i=1}^{N} (y_i − ŷ_i)² )
(15) MAE = (1/N) Σ_{i=1}^{N} |ŷ_i − y_i|
(16) R2 = 1 − Σ_i (ŷ_i − y_i)² / Σ_i (ȳ − y_i)²

where y_i is the actual value, ŷ_i is the predicted value and N is the number of samples; the denominator Σ_i (ȳ − y_i)² corresponds to the benchmark model. As R2 approaches 1, the model's predictions improve.
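
Equations (13)–(16) translate directly into NumPy. The toy values below are assumed for illustration, not taken from the paper's test set:

```python
import numpy as np

def metrics(y_true, y_pred):
    """MAPE, RMSE, MAE and R^2 as defined in Equations (13)-(16)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mape = np.mean(np.abs(y_pred - y_true) / y_true) * 100          # Eq. (13)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))                 # Eq. (14)
    mae = np.mean(np.abs(y_pred - y_true))                          # Eq. (15)
    r2 = 1 - np.sum((y_pred - y_true) ** 2) / np.sum((y_true.mean() - y_true) ** 2)  # Eq. (16)
    return mape, rmse, mae, r2

y = np.array([100.0, 110.0, 120.0, 130.0])      # actual loads (toy)
yhat = np.array([102.0, 108.0, 121.0, 129.0])   # predicted loads (toy)
mape, rmse, mae, r2 = metrics(y, yhat)          # r2 = 0.98 for this toy example
```

Note that MAPE is undefined when any actual value is zero, which is not an issue for power loads but matters for general use.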

4.4. The comparison of different models

To demonstrate the superiority of the proposed algorithm, the LSTM and CNN-LSTM models are compared with the proposed SSA-CNN-LSTM method. Comparison curves between the predicted values and the actual values are shown in Figures 5–7.

Figures 5–7 showcase the predictions obtained from the LSTM, CNN-LSTM and SSA-CNN-LSTM algorithms. The simulations indicate that the LSTM model’s predictions run somewhat below the actual values, while demonstrating a fundamentally consistent trend with them. The predicted values of the CNN-LSTM model closely align with the actual values in the troughs and surpass the accuracy of the LSTM model at the crests. The SSA-CNN-LSTM model exhibits the best fitting effect, generally corresponding to the true magnitude of the power load: as the true value fluctuates, the predicted value closely follows, giving an improved fit. The experimental results highlight the considerable precision and robustness of the proposed model. The MAPE, RMSE, MAE and R2 values of the above models have been calculated and are presented in Table 4.

Figure 5. Comparison graph between predicted load and actual load on May 3.


Figure 6. Comparison graph between predicted load and actual load on May 16.


Figure 7. Comparison graph between predicted load and actual load on May 20.


Table 4. Comparison results of short-term power load forecasting for different models.

The table illustrates the performance metrics of the different algorithms for short-term power forecasting in the holiday situation (May 3), specifically the LSTM, CNN-LSTM and SSA-CNN-LSTM algorithms. For the LSTM algorithm, the MAPE is 6.1%, the RMSE is 4.44 MW, the MAE is 3.99 MW and the R2 is 0.8972. For the CNN-LSTM algorithm, the MAPE is 2.9%, the RMSE is 2.05 MW, the MAE is 1.77 MW and the R2 is 0.9782. Comparatively, the proposed algorithm outperforms both the LSTM and CNN-LSTM models. The MAPE of the proposed method is a mere 1.6%, a reduction of 4.5 and 1.3 percentage points compared to the LSTM and CNN-LSTM models, respectively. The RMSE of the proposed model is 1.28 MW, a reduction of 3.16 MW and 0.77 MW in comparison. The MAE of the proposed model is 1.02 MW, a decrease of 2.97 MW and 0.75 MW, respectively. Additionally, the R2 value of the proposed model is 0.9914, an increase of 0.0942 and 0.0132 compared to the LSTM and CNN-LSTM models, respectively.

Regarding the weekend situation (May 16), the LSTM model demonstrates MAPE, RMSE and MAE values of 3.6%, 3.6 MW and 2.95 MW, respectively, with an R2 score of 0.9287, whereas the CNN-LSTM model exhibits corresponding values of 2.5%, 2.1 MW and 1.82 MW and achieves an R2 score of 0.9769. Comparatively, the proposed model shows a MAPE of 2%, decreasing by 1.6 and 0.5 percentage points relative to the LSTM and CNN-LSTM models, respectively. Furthermore, the RMSE of the proposed model decreases by 1.8 MW and 0.3 MW, respectively, to a value of 1.8 MW. Additionally, the MAE of the proposed model decreases by 1.36 MW and 0.23 MW, respectively, to a value of 1.59 MW. Lastly, the R2 score of the proposed model increases by 0.052 and 0.0038, respectively, to a value of 0.9807.

Concerning the situation of weekdays (May 20), in comparison to both the LSTM and CNN-LSTM models, the presented algorithm model presents a remarkable decrease in MAPE by 1.4% and 0.8%, respectively, resulting in a value of 1.2%. Furthermore, the proposed model experiences a significant reduction in RMSE by 1.06 MW and 0.59 MW, respectively, arriving at 1.17 MW. Similarly, the MAE decreases by 1.06 MW and 0.59 MW, respectively, obtaining a value of 0.97 MW. Most notably, the R2 of the proposed model exhibits a substantial enhancement, increasing by 0.0247 and 0.0134, respectively, to reach a value of 0.9919.

5. Conclusions

The novel SSA-CNN-LSTM algorithm is proposed, taking into account the temporal and nonlinear characteristics of short-term power load data. The utilization of the SSA algorithm optimizes the important parameters of the algorithm, thereby reducing the reliance on human intervention in parameter selection. The objective of this study is to showcase the effectiveness and feasibility of the forecasting model based on various scenarios such as holidays, weekends, and working days for load forecasting. The model’s performance is then compared to that of the LSTM and CNN-LSTM models. The simulation results confirm the model’s considerable robustness and accurate forecasting ability, thereby endowing the actual power system with invaluable theoretical insights and technical support.

Acknowledgements

The authors sincerely thank the National Natural Science Foundation of China for their financial support.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This project was supported by the National Natural Science Foundation of China (NSFC) (61673281, 61903264) and Scientific Research Funding Project of Liaoning Province, China (LJKZ0689), the Natural Science Foundation of Liaoning Province (2019-KF-03-01).

References

  • An, G., Jiang, Z., Chen, L., Cao, X., Li, Z., Zhao, Y., & Sun, H. (2021). Ultra short-term wind power forecasting based on sparrow search algorithm optimization deep extreme learning machine. Sustainability, 13(18), 10453. https://doi.org/10.3390/su131810453
  • Chen, Y., Xu, P., Chu, Y., Li, W., Wu, Y., Ni, L., Bao, Y., & Wang, K (2017). Short-term electrical load forecasting using the support vector regression (SVR) model to calculate the demand response baseline for office buildings. Applied Energy, 195, 659–670. https://doi.org/10.1016/j.apenergy.2017.03.034
  • Conejo, A. J., Plazas, M. A., Espinola, R., & Molina, A. B. (2005). Day-ahead electricity price forecasting using the wavelet transform and ARIMA models. IEEE Transactions on Power Systems, 20(2), 1035–1042. https://doi.org/10.1109/TPWRS.2005.846054
  • Dhaval, B., & Deshpande, A. (2020). Short-term load forecasting with using multiple linear regression. International Journal of Electrical and Computer Engineering (IJECE), 10(4), 3911–3917. https://doi.org/10.11591/ijece.v10i4.pp3911-3917
  • Dong, X., Qian, L., & Huang, L. (2017a). A CNN based bagging learning approach to short-term load forecasting in smart grid. In 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), pp. 1–6.
  • Dong, X., Qian, L., & Huang, L. (2017b). Short-term load forecasting in smart grid: A combined CNN and K-means clustering approach. In 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), IEEE, February 2017, pp. 119–125.
  • Ghimire, S., Yaseen, Z. M., Farooque, A. A., Deo, R. C., Zhang, J., & Tao, X. (2021). Streamflow prediction using an integrated methodology based on convolutional neural network and long short-term memory networks. Scientific Reports, 11(1), 17497. https://doi.org/10.1038/s41598-021-96751-4
  • Guan, C., Luh, P. B., Michel, L. D., & Chi, Z. Y. (2017). Hybrid Kalman filters for very short-term load forecasting and prediction interval estimation. IEEE Transactions on Power Systems, 28, 3806–3817. https://doi.org/10.1109/TPWRS.2013.2264488
  • Hao, Z., Liu, G., & Zhang, H. (2018). Correlation filter-based visual tracking via adaptive weighted CNN features fusion. IET Image Processing, 12(8), 1423–1431.
  • Jin, Y., Guo, H., Wang, J., & Song, A. (2020). A hybrid system based on LSTM for short-term power load forecasting. Energies, 13, 6241. https://doi.org/10.3390/en13236241
  • Lei, W., Shu, L., Li, L., & Jian, M. (2016, August). Studies on load characteristics of Tianjin power grid during the twelfth five-year. In 2016 China International Conference on Electricity Distribution (CICED), IEEE, pp. 1–5.
  • Liao, X., Kang, X., Li, M., & Cao, N. (2019, January). Short term load forecasting and early warning of charging station based on PSO-SVM. In 2019 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS) (pp. 305–308). IEEE.
  • Lin, J., Ma, J., Zhu, J., & Cui, Y. (2022). Short-term load forecasting based on LSTM networks considering attention mechanism. International Journal of Electrical Power & Energy Systems, 137, 107818. https://doi.org/10.1016/j.ijepes.2021.107818
  • Liu, N., Tang, Q., Zhang, J., Fan, W., & Liu, J. (2014). A hybrid forecasting model with parameter optimization for short-term load forecasting of micro-grids. Applied Energy, 129, 336–345. https://doi.org/10.1016/j.apenergy.2014.05.023
  • Ma, W., Qiu, L., Sun, F., Ghoneim, S. S., & Duan, J. (2022). PV power forecasting based on relevance vector machine with sparrow search algorithm considering seasonal distribution and weather type. Energies, 15(14), 5231. https://doi.org/10.3390/en15145231
  • Mi, J., Fan, L., Duan, X., & Qiu, Y. (2018). Short-term power load forecasting method based on improved exponential smoothing grey model. Mathematical Problems in Engineering, 2018.
  • Nie, H., Liu, G., Liu, X., & Wang, Y. (2012). Hybrid of ARIMA and SVMs for short-term load forecasting. Energy Procedia, 16, 1455–1460.
  • Nury, A. H., Hasan, K., & Alam, M. J. B. (2017). Comparative study of wavelet-ARIMA and wavelet-ANN models for temperature time series data in northeastern Bangladesh. Journal of King Saud University-Science, 29(1), 47–61. https://doi.org/10.1016/j.jksus.2015.12.002
  • Qin, G., Yan, Q., Zhu, J., Xu, C., & Kammen, D. M. (2021). Day-ahead wind power forecasting based on wind load data using hybrid optimization algorithm. Sustainability, 13(3), 1164. https://doi.org/10.3390/su13031164
  • Rafi, S. H., Deeba, S. R., & Hossain, E. (2021). A short-term load forecasting method using integrated CNN and LSTM network. IEEE Access, 9, 32436–32448. https://doi.org/10.1109/ACCESS.2021.3060654
  • Rehman, A., Athar, A., Khan, M. A., Abbas, S., Fatima, A., & Saeed, A. (2020). Modelling, simulation, and optimization of diabetes type II prediction using deep extreme learning machine. Journal of Ambient Intelligence and Smart Environments, 12, 1–14.
  • Sajjad, M., Khan, Z. A., Ullah, A., Hussain, T., Ullah, W., Lee, M. Y., & Baik, S. W. (2020). A novel CNN-GRU-based hybrid approach for short-term residential load forecasting. IEEE Access, 8, 143759–143768. https://doi.org/10.1109/ACCESS.2020.3009537
  • Srinivasan, D., & Lee, M. A. (1995). Survey of hybrid fuzzy neural approaches to electric load forecasting. In 1995 IEEE International Conference on Systems, Man and Cybernetics, 22–25 October 1995, Canada.
  • Tian, C., Ma, J., Zhang, C., & Zhan, P. (2018). A deep neural network model for short-term load forecast based on long short-term memory network and convolutional neural network. Energies, 11(12), 3493. https://doi.org/10.3390/en11123493
  • Tian, Y. J., Shen, H. X., & Gao, W. G. (2019). Short-term forecasting of electric load based on Kalman filter with elastic net method. IOP Conference Series: Earth and Environmental Science, 354(1), 012112.
  • Wang, F., He, T., & Nie, H. (2017). Power load prediction based on multiple linear regression model. Boletin Tecnico/Technical Bulletin, 55(07), 390–397.
  • Wang, H. R., & Xian, Y. (2021). Optimal configuration of distributed generation based on sparrow search algorithm. IOP Conference Series: Earth and Environmental Science, 647, 012053. https://doi.org/10.1088/1755-1315/647/1/012053
  • Wang, Y., Chen, Q., Gan, D. H., Yang, J., Kirschen, D. S., & Kang, C. (2019). Deep learning-based socio-demographic information identification from smart meter data. IEEE Transactions on Smart Grid, 10(3), 2593–2602. https://doi.org/10.1109/TSG.2018.2805723
  • Xue, J., & Shen, B. (2020). A novel swarm intelligence optimization approach: Sparrow search algorithm. Systems Science & Control Engineering: An Open Access Journal, 8(1), 22–34. https://doi.org/10.1080/21642583.2019.1708830
  • Yan, R., Liao, J., Yang, J., Sun, W., Nong, M., & Li, F. (2020). Multi-hour and multi-site air quality index forecasting in Beijing using CNN, LSTM, CNN-LSTM, and spatiotemporal clustering. Expert Systems with Applications, 169, 114513.
  • Zheng, J., Xu, C., Zhang, Z., & Li, X. (2017). Electric load forecasting in smart grids using Long-Short-Term-Memory based Recurrent Neural Network. In Proceedings of the 51st Annual Conference on Information Sciences and Systems (CISS), Baltimore, MD, USA, 22–24 March 2017, pp. 1–6.