Effective electrical load forecasting depends on the quality of historical data and the efficiency of forecasting algorithms. However, missing data caused by sensor errors, communication failures, and data processing anomalies is a significant problem that not only compromises the integrity of the dataset but also reduces forecasting accuracy. Machine learning (ML) based imputation techniques address this issue by estimating and substituting the missing values from the correlations present within the dataset. In this study, four ML-based imputation approaches, namely Random Forest (RF), Support Vector Regression (SVR), K-Nearest Neighbors (KNN), and Extreme Gradient Boosting (XGBoost), are applied to enhance the accuracy and reliability of electrical load forecasting. A synthetic linear missing data pattern is introduced into the original dataset, and the imputation methods are evaluated for their effectiveness in restoring data integrity. The evaluation is carried out by feeding the imputed datasets into two deep learning (DL) forecasting frameworks: Recurrent Neural Network (RNN) and Gated Recurrent Unit (GRU). Predictive performance is measured with Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R2), along with an analysis of computational efficiency. The comparison between the DL structures indicates that the RNN requires less computational time, while the GRU consistently delivers superior forecasting accuracy across all imputation methods. Among the evaluated imputation techniques, XGBoost achieves the lowest MSE at 6% missing data (894.98 with RNN; 876.62 with GRU), while RF is the most consistent, particularly at higher missing data rates (MSE: 1259.17 at 30% missingness). These findings highlight the importance of selecting a suitable imputation technique to improve load forecasting in practical applications.
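The four accuracy metrics named in the abstract follow standard definitions, and a minimal sketch of them is given here for reference. It is not the authors' evaluation code: the function name `regression_metrics` and the sample load values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R2 for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    mse = sum(e * e for e in errors) / n       # mean squared error
    rmse = math.sqrt(mse)                      # same units as the load
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    # R2 is undefined when the actual series is constant
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical load values (illustrative only, not taken from the paper)
actual = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 115.0, 150.0, 155.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the reported MSE of 894.98 corresponds to an RMSE of roughly 29.9 in the units of the load series.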

(2025). Machine Learning-Based Imputation Approaches for Efficient Electrical Load Forecasting. Retrieved from https://hdl.handle.net/10446/313065

Machine Learning-Based Imputation Approaches for Efficient Electrical Load Forecasting

Hussain, Ayaz; Giangrande, Paolo; Franchini, Giuseppe; Fenili, Lorenzo; Messi, Silvio
2025-01-01

2025
English
2025 IEEE 13th International Conference on Smart Energy Grid Engineering, SEGE 2025
979-8-3315-8592-1
67
72
online
United States
Piscataway
IEEE (Institute of Electrical and Electronics Engineers)
anonymous experts
SEGE 2025: 13th IEEE International Conference on Smart Energy Grid Engineering, Oshawa, Canada, 18-20 August 2025
13th
Oshawa, Canada
18-20 August 2025
IEEE
Toronto Section NPS Chapter
international
contribution
Sector IIND-08/A - Converters, electrical machines and drives
Deep Learning (DL); Extreme Gradient Boosting (XGBoost); Gated Recurrent Unit (GRU); KNN (K-Nearest Neighbors); Machine Learning (ML); Random Forest (RF); Recurrent Neural Network (RNN); Support Vector Regression (SVR)
https://ieeexplore.ieee.org/document/11203375
info:eu-repo/semantics/conferenceObject
5
Hussain, Ayaz; Giangrande, Paolo; Franchini, Giuseppe; Fenili, Lorenzo; Messi, Silvio
1.4 Contributions in conference proceedings::1.4.01 Conference presentations
reserved
Not defined
273