Gerardo Ondo Micha¹, Chul-Hwan Kim†
(Dept. of Electrical and Computer Engineering, Sungkyunkwan University, Korea.)
                        
 
            
            
            Copyright © The Korean Institute of Electrical Engineers(KIEE)
            
            
            
            
            
               
                  
Key words
               
Distributed energy sources, ensemble regression, intermittency of renewable energy, power forecasting, Support Vector Machine.
             
            
          
         
            
                  1. Introduction
Global warming has intensified as a result of the widespread use of fossil fuels to produce energy. Renewable energy resources could therefore be a feasible solution to this issue by reducing carbon dioxide (CO2) emissions and lowering pollution levels (1). Renewable energy sources such as solar, wind, hydro, tidal, geothermal, and biofuels are all of interest, because these pollution-free sources do not harm the ecosystem (2). Many scientists and researchers have investigated and assessed the potential of renewable energy sources including solar, wind, and hydropower (3).
                  
               
Because it is clean, free of fuel cost, and plentiful, solar energy is one of the most promising energy sources (4). As a result, solar power is in strong demand for power generation as part of efforts to address these environmental problems. However, because of environmental factors such as ambient temperature, solar radiation, shading, humidity, and wind speed, the output power of photovoltaic (PV) panels is intermittent and hard to predict. This is one of the difficulties that grid operators must overcome to operate the power supply system effectively (5). To address these issues, several strategies for balancing electric power consumption and generation have been developed. One technique for improving the operating reliability of power systems is short-term load forecasting (6).
                  
               
Solar power forecasting is important for energy trading companies and power network dispatching centers, which rely on it to make accurate decisions on critical issues such as power system scheduling and operational control (7). Furthermore, accurate solar power forecasting increases the overall power system's efficiency and power quality (8). Solar power forecasting is divided into two categories: direct and indirect forecasting. In direct forecasting, the model output is the predicted solar power itself. Indirect forecasting, on the other hand, produces predicted values of solar radiation, and PV output models then use these forecasted radiation values to calculate solar power generation (9).
                  
               
With recent advancements in Artificial Intelligence (AI)/Machine Learning (ML), load forecasting can take weather and atmospheric conditions into account, yielding higher accuracy than conventional methods (10). In (11), individual forecasting methods are shown to have limited performance, low forecasting accuracy, and high error. It is therefore necessary to develop combined algorithms that yield more robust performance and increase the forecasting accuracy of the models. This is achieved by ensemble learning, where individual models serve as weak or base learners and their predictions are combined into a more accurate predictive model for classification or regression problems (12). Various ensemble learning algorithms are available in the literature. Depending on the way base models or learners are combined, there are three types of ensemble learning: bagging, boosting, and stacking (13). Different authors have applied the three algorithms separately to predict PV power using time series data (14-16). The authors in (17) present a joint bagged-boosted ANN; the proposed ensemble model predicts short-term electricity load more accurately than the bagged ANN and the boosted ANN. Separate STACK combinations were compared in (18), (19) and (20) to forecast solar energy, and more accurate models were obtained. However, to the best of our knowledge, the proposed algorithms do not consider one single Bagged-Boosted STACK with a single meta-learner to forecast future energy values.
                  
               
This paper proposes a Bagged-Boosted STACK ensemble learning model with an SVRL (support vector regression with a linear kernel) meta-learner and seven base learners: Elastic Net Regression, Random Forest, Linear Regression, Lasso Regression, AdaBoost, GBoost (Gradient Boost), and XGBoost (Extreme Gradient Boost). Correlation analysis (Corr) and principal component analysis (PCA) were used to pre-process the time series data and to reduce variance and over-fitting.
                  
               
Bagging algorithms tend to decrease model variance, whilst boosting focuses on reducing bias error (21). The contribution of this work is to address the variance-bias tradeoff by using a STACK combination of both to filter out the generalization biases and variances. To the best of our knowledge, our combination of base learner models, as well as the high number of base learners used in our STACK, has rarely been used to forecast PV output power. The model proposed in this paper is flexible and can be used with other methods and with different base and super-learners. The scheme is compared with bagging, boosting, and bagging-boosting algorithms to demonstrate the superiority of the proposed model.
               
             
            
                  2. The Proposed Intelligent PV Power Forecasting Model
The proposed model is a hybrid combination of seven different weak learners into one final SVRL meta-predictor to forecast the output power of a PV system. Improved accuracy of the forecasting model is achieved by combining bagging and boosting algorithms in level 0 of the STACK, thus reducing variance and bias errors.
               
               
                     2.1 Data Collection
The time series solar data of Gyeongnam, South Korea, from January 2017 to June 2020 were obtained from a 350 kW 3rd PV power plant. The dataset comprises 7 features: UNIX date and time, temperature, wind direction, wind speed, rainfall, humidity, and PV output power. The data were carefully reviewed, and missing values were filled using Bayesian Ridge Regression. Figure 1 shows the PV output profile (kW) from 2017 to 2020.
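The paper does not state how Bayesian Ridge Regression was applied to the missing values. One plausible reading, sketched below, is regression-based imputation in which each incomplete column is predicted from the other columns. The sketch assumes scikit-learn's `IterativeImputer` with a `BayesianRidge` estimator; the column layout and all numbers are made up for illustration and are not taken from the Gyeongnam dataset.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

# Hypothetical rows: temperature, wind speed, humidity, PV output (kW).
# np.nan marks a missing reading; values are illustrative only.
data = np.array([
    [12.0, 2.1, 70.0, 110.0],
    [15.0, 3.0, 65.0, 150.0],
    [18.0, np.nan, 60.0, 190.0],
    [21.0, 4.2, 55.0, np.nan],
    [24.0, 5.0, 50.0, 270.0],
    [27.0, 5.9, 45.0, 310.0],
])

# Each column with gaps is regressed on the remaining columns with Bayesian Ridge,
# and the fitted model supplies the missing entries.
imputer = IterativeImputer(estimator=BayesianRidge(), max_iter=10, random_state=0)
filled = imputer.fit_transform(data)

observed = ~np.isnan(data)  # mask of readings that were present in the input
```

Readings that were present in the input are passed through unchanged; only the `np.nan` entries are replaced by model estimates.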
                  
                  
                     
                     
                           
                           
Fig. 1. Photovoltaic energy production from 2017 to 2020.
                         
                     
                  
                
               
                     2.2 Feature Selection
The correlation between the different data features is shown in figure 2. The correlation map shows wind speed as the feature with the highest correlation (0.46) with the PV output power. A relatively high correlation coefficient is also observed for temperature (0.31) and the year; the rest of the features have negative correlation coefficients with PV energy production. Based on the feature statistics in Table 1 and the correlation map, five features were selected: wind speed, temperature, year, day-of-week number, and wind direction. The pre-processed data is normalized as in (1) to the scale of [-1, 1], which allows the model to converge faster and prevents very large weights from being assigned to features with larger values during training. PCA was then applied to further reduce the data dimension to two uncorrelated principal components while retaining as much information from the previous dataset as possible. As shown in figure 2, wind speed and temperature have the highest positive correlation with the PV output: when these features increase, the PV output power tends to increase, and when they decrease, the PV output power tends to decrease.
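As an illustration of the two operations described in this subsection, the following minimal sketch computes a Pearson correlation coefficient and rescales a feature to [-1, 1]. It assumes plain min-max scaling, which may differ in detail from the paper's equation (1); the function names and sample readings are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def scale_to_unit_range(values):
    """Min-max scale values linearly onto [-1, 1] (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]

# Made-up sample readings, for illustration only.
wind_speed = [0.0, 1.5, 3.0, 4.5, 6.0]
pv_power = [10.0, 40.0, 55.0, 90.0, 120.0]

r = pearson(wind_speed, pv_power)         # strength of the linear relationship
scaled = scale_to_unit_range(wind_speed)  # smallest value maps to -1, largest to 1
```

A coefficient close to 1 indicates that the feature and the PV output rise and fall together, which is the selection criterion applied to the correlation map.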
                     
                  
                  
                     
                     
                           
                           
Fig. 2. Feature correlation heat map.
                         
                     
                  
                  
                     
                     
                     
                     
                           
                           
Table 1. Feature statistics.

| Feature | Mean | Standard Deviation | Minimum Value | Maximum Value |
| Temperature | 13.330018 | 9.547295 | -13.5 | 35.1 |
| Wind Direction | 165.690331 | 124.355640 | 0.0 | 360.0 |
| Wind Speed | 2.348275 | 1.835690 | 0.0 | 13.6 |
| Rainfall | 0.164399 | 1.246811 | 0.0 | 68.5 |
| Humidity | 69.847010 | 21.563524 | 1.1 | 99.9 |
| Year | 2018.285043 | 1.029817 | 2017 | 2020 |
| Month | 6.09475 | 3.424773 | 1.0 | 12.0 |
| Day | 15.714174 | 8.792300 | 1.0 | 31.0 |
| Day of Week number | 2.998434 | 2.001989 | 0.0 | 6.0 |
                              
                           
                        
                      
                     
                  
                
               
                     2.3 The Framework
The framework structure in Figure 3 shows the steps followed in this paper to implement the methodology. First, the raw time-series data were pre-processed and the feature statistics were analyzed. The data were then split into training and test sets of 80% and 20%, respectively. Next, the data were scaled and features were selected using Corr and PCA. The base models in level 0 of the STACK were then trained, and the predictions made by the bagging and boosting models were combined and used as input for the SVRL meta-learner in level 1. Finally, the trained stacking model was evaluated on the test dataset, and several metrics were used to compare the proposed model with the bagging, boosting, and bagging-boosting models separately.
                  
                  
                     
                     
                           
                           
Fig. 3. The proposed STACK model framework.
                         
                     
                  
The proposed model is a hybrid combination of seven different weak learners into one final SVRL meta-predictor to forecast the output power of a PV system. Improved accuracy of the forecasting model is achieved by combining bagging and boosting algorithms in level 0 of the STACK. This is done by averaging the predictions made by every weak learner, thus reducing variance and bias errors.
                  
A support vector machine is a machine learning method that makes the problem linear by transforming the original space into a higher-dimensional space by means of a kernel $K\left(X_{n},\: X_{n'}\right)$.
                     
                     
                  
In regression, the error is minimized without penalizing deviations that fall within a ±ɛ interval around the estimate. To keep the models from over-fitting, a certain amount of error is tolerated in the data, which is controlled by the hyper-parameter C.
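The ±ɛ interval corresponds to the ɛ-insensitive loss commonly used in support vector regression. The sketch below assumes that standard form; the function name and the sample values are illustrative and do not come from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample SVR loss: zero inside the +/- epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

# Made-up PV power values in kW, for illustration only.
actual = [100.0, 150.0, 200.0]
predicted = [104.0, 149.0, 188.0]

# With epsilon = 5 kW, the first two errors (4 kW and 1 kW) lie inside the tube
# and cost nothing; the third error (12 kW) is penalized by 12 - 5 = 7.
losses = epsilon_insensitive_loss(actual, predicted, epsilon=5.0)
```

In the standard formulation, C multiplies the sum of these losses in the training objective, so a larger C penalizes points outside the tube more heavily relative to the regularization term.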
                     
                     
                  
By combining bagging and boosting, the total forecasting error is expected to decrease significantly. The expected squared error of a model can be decomposed as squared bias + variance + irreducible error. In bagging, models with very little bias but high variance are used, and aggregating them reduces the variance without appreciably increasing the bias. In boosting, models with very little variance but high bias are used, and fitting the models sequentially reduces the bias. Therefore, each of the strategies reduces a part of the total error of the STACK.
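The variance-reduction effect of aggregation can be illustrated with a small simulation. This is a simplified sketch, not the paper's experiment: it assumes unbiased base predictions with independent Gaussian noise, whereas models trained on bootstrap samples are correlated, so the reduction seen in practice is smaller. All names and numbers are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

def noisy_model(true_value, noise_sd):
    """One unbiased but high-variance prediction of true_value."""
    return random.gauss(true_value, noise_sd)

def bagged_model(true_value, noise_sd, n_models):
    """Average of n_models independent noisy predictions."""
    predictions = [noisy_model(true_value, noise_sd) for _ in range(n_models)]
    return statistics.fmean(predictions)

TRUE_POWER = 200.0  # hypothetical PV output in kW

# Repeat each kind of prediction many times and compare the spread.
single = [noisy_model(TRUE_POWER, 20.0) for _ in range(2000)]
bagged = [bagged_model(TRUE_POWER, 20.0, n_models=25) for _ in range(2000)]

var_single = statistics.variance(single)
var_bagged = statistics.variance(bagged)
mean_bagged = statistics.fmean(bagged)
```

Under the independence assumption, averaging 25 predictions divides the variance by roughly 25 while the mean stays near the true value, which is the behaviour the paragraph attributes to bagging.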
                     
                     
                  
In bagging, each model is different from the rest because each one is trained with a different sample obtained by bootstrapping. In boosting, the models are fitted sequentially and the importance (weight) of the observations changes with each iteration, leading to different fits.
                  
                
               
                     2.4 Algorithm
Given $T$ models in level 0 of the STACK with base regressors $h_{1},\:...,\:h_{T}$, a meta-regressor $h_{new}$, and a training dataset $D$:

· Let $D=\{(x_{i},\:y_{i})\:|\:x_{i}\in X,\:y_{i}\in Y\}$, $i = 1$ to $m$.

· For $t = 1$ to $T$, learn the base regressor $h_{t}$ (bagging or boosting) based on $D$.

· Construct a new dataset from $D$:

· For $i = 1$ to $m$, construct a new sample {$x_{i}^{new},\:y_{i}$},

· where $x_{i}^{new}=\{h_{t}(x_{i})$ for $t = 1$ to $T\}$.

· Learn the meta-regressor $h_{new}$ based on the new dataset {$x_{i}^{new},\:y_{i}$}.

· Return $H(x)=h_{new}(h_{1}(x),\:h_{2}(x),\:...,\:h_{T}(x))$.
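The steps above can be sketched with scikit-learn's `StackingRegressor`. This is a minimal illustration under several assumptions, not the authors' implementation: the data are synthetic stand-ins for the two PCA components, hyper-parameters are arbitrary, XGBoost is left out because it lives in the separate `xgboost` package, and `StackingRegressor` builds the level-1 dataset from cross-validated predictions rather than from predictions on the full training set $D$.

```python
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for two scaled input components and a target; not the paper's data.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(300, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * np.sin(4.0 * X[:, 0]) + rng.normal(0.0, 0.1, 300)

# 80% / 20% split, as in the framework description.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Level 0: bagging-type, boosting-type and linear base regressors h_1 ... h_T.
base_learners = [
    ("elastic_net", ElasticNet(alpha=0.01)),
    ("random_forest", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("linear", LinearRegression()),
    ("lasso", Lasso(alpha=0.01)),
    ("adaboost", AdaBoostRegressor(n_estimators=50, random_state=0)),
    ("gboost", GradientBoostingRegressor(n_estimators=50, random_state=0)),
]

# Level 1: linear-kernel SVR meta-regressor h_new fitted on the base predictions.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=SVR(kernel="linear", C=1.0, epsilon=0.05),
    cv=5,
)
stack.fit(X_train, y_train)

y_hat = stack.predict(X_test)           # H(x) = h_new(h_1(x), ..., h_T(x))
r2_test = stack.score(X_test, y_test)   # coefficient of determination on the test set
```

Using out-of-fold predictions for the level-1 dataset is a design choice that keeps the meta-regressor from simply trusting base learners that over-fit the training data.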
A different set of training data $D=(x_{i},\:y_{i})$ was collected and a linear kernel was used. The correlation matrix was formed as in (1).
                  
                  
                     
                     
                     
                     
                     
                  
                  Where $\varepsilon$ represents the violation concept and the correlation vector $K$
                     is used to compute the concentration coefficient, $\vec{\alpha}$. $\vec{y}$ contains
                     all the values corresponding to $D$.
                  
                  
                     
                     
                     
                     
                     
                  
                  $\vec{\alpha}$ is used to create the estimator for our model and maintain the forecasting
                     error of the STACK model below the threshold (23).
                  
                
             
            
                  3. Performance Evaluation
Given that $y$ is the actual value, $\widehat{y}$ is the predicted value, $n$ is the number of data samples, and $Var$ represents the variance, the following metrics were used to evaluate our model performance:
                  
                  
               
Explained Variance Score (EVS): measures the proportion of the variance of $y$ that is accounted for by $\widehat{y}$; the ideal value of EVS is 1:

$$EVS = 1-\frac{Var(y-\widehat{y})}{Var(y)}$$
               
               
                  
                  
                  
                  
                  
               
Mean Absolute Error (MAE): measures the absolute error between the predicted value and the actual value. It shows how big the forecast error is on average.

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_{i}-\widehat{y}_{i}\right|$$
               
               
                  
                  
                  
                  
                  
               
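Mean Squared Error (MSE): measures the average squared difference between the predicted values and the actual values, so large errors are penalized more heavily.

$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}$$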
               
               
                  
                  
                  
                  
                  
               
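Root Mean Squared Error (RMSE): the square root of the MSE, which expresses how far the predictions are from the actual values in the same units as the PV output.

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}}$$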
               
               
                  
                  
                  
                  
                  
               
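Determination Coefficient $(R^{2})$: $R^{2}$ is the proportion of the variation of the actual value that is explained by the model through one or more predictor variables. Generally, the higher the $R^{2}$, the better the model fits the given data.

$$R^{2} = 1-\frac{\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}}$$

where $\bar{y}$ is the mean of the actual values.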
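The five metrics can be computed directly from paired actual and predicted values. The sketch below assumes the standard textbook definitions of EVS, MAE, MSE, RMSE and $R^{2}$; the function name and the sample values are hypothetical and are not results from the paper.

```python
from math import sqrt
from statistics import fmean, pvariance

def regression_metrics(y_true, y_pred):
    """Return EVS, MAE, MSE, RMSE and R^2 for paired actual/predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = fmean(y_true)
    mse = fmean([e * e for e in errors])
    ss_res = sum(e * e for e in errors)                # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "evs": 1.0 - pvariance(errors) / pvariance(y_true),
        "mae": fmean([abs(e) for e in errors]),
        "mse": mse,
        "rmse": sqrt(mse),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Made-up PV power values in kW; every prediction is 10 kW too high.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 160.0, 210.0, 260.0]

metrics = regression_metrics(actual, predicted)
```

In this example the constant 10 kW offset leaves EVS at its ideal value, because the error has no variance, while MAE, RMSE and $R^{2}$ all register the bias. This is one reason to report several metrics together.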
               
               
                  
                  
                  
                  
                  
               
               
                  
                  
                  
                  
                  
               
             
            
                  4. Results and Discussion
This section compares the performance of the proposed Bagging-Boosting STACK SVRL with bagging, boosting, and a combination of bagging-boosting models, using Elastic Net Regression, Random Forest, Linear Regression, Lasso Regression, AdaBoost, GBoost, and XGBoost as base learners of the STACK. The simulations and coding were performed in Python. In the plots, the blue dotted line represents the performance on the training set, made up of 80% of the original dataset, whilst the green line represents the model performance on the 20% testing dataset. In figure 4, the bagging model shows reduced bias and variance compared to boosting in figure 5. A lower bias error is achieved by combining the bagging and boosting algorithms, as shown in figure 6, whilst the proposed STACK model in figure 7 shows the lowest variance and bias errors.
               
               
                  
                  
                        
                        
Fig. 4. Bagging Model Performance Evaluation.
                      
                  
               
               
                  
                  
                        
                        
Fig. 5. Boosting Model Performance Evaluation.
                      
                  
               
               
                  
                  
                        
                        
Fig. 6. Bagging-Boosting Model Performance Evaluation
                      
                  
               
               
                  
                  
                        
                        
Fig. 7. Bagging-Boosting STACK Model Performance
                      
                  
               
In figure 8, the different weak learners used in level 0 of our STACK model are compared using five different metrics. The most robust performance is observed for the Random Forest model, with a value of 0.25% for both $R^{2}$ and EVS. Figure 9 shows the results of the comparison among the selected algorithms. The proposed bagging-boosting STACK shows better overall performance by addressing the bias-variance tradeoff. The metrics used in assessing the four models show the lowest forecasting error for the STACK in comparison with the other algorithms, hence the best forecasting ability.
               
               
                  
                  
                        
                        
Fig. 8. Comparison of Error Metrics among the base learners
                      
                  
               
               
                  
                  
                        
                        
Fig. 9. Comparison of Error Metrics with Existing Models
                      
                  
               
             
            
                  5. Conclusion
In this work, we propose a bagging-boosting STACK model that combines different algorithms used in solar power prediction. The proposed STACK uses an SVRL as a meta-learner to forecast PV output, and seven different weak learners provide input predictions to the meta-learner. The proposed STACK outperforms the predictions made by bagging, boosting, and bagging-boosting models separately. With the proposed model, the tradeoff between variance and bias is significantly reduced. Bagging reduces the variance of the weak learners whilst boosting reduces their bias; by combining both algorithms into one STACK model, the generalization and forecasting errors are reduced. The developed methodology showed that the prediction error of PV output power can be reduced; however, more data is required to train the models, as shown by the results. The proposed STACK model can be improved by testing different base learners and meta-learners, and also by increasing the number of layers of the STACK.
               
             
          
         
            
                  Acknowledgements
               
                  This work has been supported by the National Research Foundation of Korea (NRF) grant
                  funded by the Korean government (MSIP) (No. 2021R1A2B5B03086257).
                  
               
             
            
                  
                     References
                  
                     
                        
M.K. Behera, I. Majumder, N. Nayak, 2018, Solar photovoltaic power forecasting using optimized modified extreme learning machine technique, Engineering Science and Technology, an International Journal, Vol. 21, pp. 428-438

 
                     
                        
P. Dawan, K. Sriprapha, S. Kittisontirak, T. Boonraksa, N. Junhuathon, W. Titiroongruang, S. Niemcharoen, 2020, Comparison of power output forecasting on the photovoltaic system using adaptive neuro-fuzzy inference systems and particle swarm optimization-artificial neural network model, Energies, Vol. 13, pp. 351

 
                     
                        
                        Y. Zhang, J. Ren, Y. Pu, P. Wang, 2020, Solar energy potential assessment: A framework
                           to integrate geographic, technological, and economic indices for a potential analysis,
                           Renewable Energy, Vol. 149, pp. 577-586

 
                     
                        
Y.K. Semero, J. Zhang, D. Zheng, 2018, PV power forecasting using an integrated GA-PSO-ANFIS approach and Gaussian process regression based feature selection strategy, CSEE Journal of Power and Energy Systems, Vol. 4, pp. 210-218

 
                     
                        
M. Zamo, O. Mestre, P. Arbogast, 2014, A benchmark of statistical regression methods for short-term forecasting of photovoltaic electricity production, part I: Deterministic forecast of hourly production, Solar Energy, Vol. 105, pp. 792-803

 
                     
                        
F. Rodríguez, A. Fleetwood, A. Galarza, L. Fontán, 2018, Predicting solar energy generation through artificial neural networks using weather forecasts for microgrid control, Renewable Energy, Vol. 126, pp. 855-864

 
                     
                        
A.T. Eseye, J. Zhang, D. Zheng, 2018, Short-term photovoltaic solar power forecasting using a hybrid wavelet-PSO-SVM model based on SCADA and meteorological information, Renewable Energy, Vol. 118, pp. 357-367

 
                     
                        
J. Shi, W.-J. Lee, Y. Liu, Y. Yang, P. Wang, 2012, Forecasting power output of photovoltaic systems based on weather classification and support vector machines, IEEE Transactions on Industry Applications, Vol. 48, pp. 1064-1069

 
                     
                        
C. Huang, L. Cao, N. Peng, S. Li, J. Zhang, L. Wang, X. Luo, J.-H. Wang, 2018, Day-ahead forecasting of hourly photovoltaic power based on robust multilayer perception, Sustainability, Vol. 10, pp. 4863

 
                     
                        
                        S. Kittisontirak, P. Dawan, N. Atiwongsangthong, W. Titiroongruang, P. Chinnavornrungsee,
                           A. Hongsingthong, K. Sriprapha, P. Manosukritkul, 2017, A novel power output model
                           for photovoltaic system

 
                     
                        
S.M. Jung, S. Park, S.W. Jung, E. Hwang, 2020, Monthly electric load forecasting using transfer learning for smart cities, Sustainability, Vol. 12, No. 16, pp. 6364

 
                     
                        
                        C.E. Borges, Y.K. Penya, I. Fernandez, 2012, Evaluating combined load forecasting
                           in large power systems and smart grids, IEEE Transactions on Industrial Informatics,
                           Vol. 9, No. 3, pp. 1570-1577

 
                     
                        
M. Leutbecher, T.N. Palmer, Ensemble forecasting, Journal of Computational Physics, Vol. 227, No. 7, pp. 3515-3539

 
                     
                        
B. Zenko, L. Todorovski, S. Dzeroski, November 2001, A comparison of stacking with meta decision trees to bagging, boosting, and stacking with other methods, In Proceedings 2001 IEEE International Conference on Data Mining, pp. 669-670

 
                     
                        
W. El-Baz, P. Tzscheutschler, U. Wagner, 2018, Day-ahead probabilistic PV generation forecast for buildings energy management systems, Solar Energy, Vol. 171, pp. 478-490

 
                     
                        
C. Persson, P. Bacher, T. Shiga, H. Madsen, 2017, Multi-site solar power forecasting using gradient boosted regression trees, Solar Energy, Vol. 150, pp. 423-436

 
                     
                        
H. Zhou, Y. Zhang, L. Yang, Q. Liu, October 2018, Short-term photovoltaic power forecasting based on Stacking-SVM, In 2018 9th International Conference on Information Technology in Medicine and Education (ITME), pp. 994-998

 
                     
                        
A.S. Khwaja, A. Anpalagan, M. Naeem, B. Venkatesh, Joint bagged-boosted artificial neural networks: Using ensemble machine learning to improve short-term electricity load forecasting, Electric Power Systems Research, Vol. 179, pp. 106080

 
                     
                        
N. Fraccanabbia, R.G. da Silva, M.H.D.M. Ribeiro, S.R. Moreno, L. dos Santos Coelho, V.C. Mariani, July 2020, Solar power forecasting based on ensemble learning methods, In 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1-7

 
                     
                        
S.R. Moreno, R.G. da Silva, M.H.D.M. Ribeiro, N. Fraccanabbia, V.C. Mariani, L.D.S. Coelho, November 2019, Very short-term wind energy forecasting based on stacking ensemble, In 14th Brazilian Computational Intelligence Meeting (CBIC), Belem, Brazil, pp. 1-7

 
                     
                        
                        S. Choi, J. Hur, 2020, An ensemble learner-based bagging model using past output data
                           for photovoltaic forecasting, Energies, Vol. 13, No. 6, pp. 1438

 
                     
                        
                        X. Luo, J. Sun, L. Wang, W. Wang, W. Zhao, J. Wu, Z. Zhang, 2018, Short-term wind
                           speed forecasting via stacked extreme learning machine with generalized correntropy,
                           IEEE Transactions on Industrial Informatics, Vol. 14, No. 11, pp. 4963-4971

 
                     
                        
                        L. Breiman, 1996, Stacked regressions, Machine learning, Vol. 24, No. 1, pp. 49-64

 
                   
                
             
About the Authors
             
             
             
            
Gerardo Ondo Micha
 
He received a B.S. degree in Electrical and Electronics Engineering from Universiti Teknologi PETRONAS, Tronoh, Malaysia, in 2017.

At present, he is enrolled in the master's degree program at Sungkyunkwan University.
             His research interests include Intermittency of Renewable Energies, power system
               protection, islanding detection, hosting capacity, auto-reclosing schemes in AC, DC,
               and Hybrid transmission lines, and artificial intelligence applications for the power
               system.
            
            
Chul-Hwan Kim

He received the B.S., M.S., and Ph.D. degrees in electrical engineering from Sungkyunkwan University, Suwon, Korea, in 1982, 1984, and 1990, respectively.

In 1990, he joined Jeju National University, Jeju, Korea, as a Full-Time Lecturer.
             He was a Visiting Academic with the University of Bath, Bath, U.K., in 1996, 1998,
               and 1999.
            
             He has been a Professor with the College of Information and Communication Engineering,
               Sungkyunkwan University, since 1992, where he is currently the Director of the Center
               for Power Information Technology.
            
             His current research interests include power system protection, artificial intelligence
               applications for protection and control, modeling and protection of microgrid and
               DC system.