Augmenting data-driven models for energy systems through feature engineering: A Python framework for feature engineering

01/04/2023
by Sandra Wilfling, et al.

Data-driven modeling is an increasingly popular approach in energy systems modeling. It applies machine learning methods such as linear regression, neural networks, or decision-tree-based methods. While these methods do not require domain knowledge, they are sensitive to data quality, so improving the quality of a dataset benefits the resulting machine learning models. Data quality can be improved through preprocessing. One type of preprocessing is feature engineering, which focuses on evaluating and improving the quality of individual features in the dataset. Feature engineering includes methods such as feature creation, feature expansion, and feature selection. This work presents a Python framework that implements methods for feature creation, expansion, and selection, as well as methods for transforming or filtering data. The framework is based on the Python library scikit-learn. It is demonstrated in a case study on energy demand prediction, in which a data-driven model is created using selected feature engineering methods. The results show that the engineered features improve prediction accuracy.
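
To make the three method families named in the abstract concrete, the following is a minimal sketch of feature creation, feature expansion, and feature selection on a toy energy-demand series. It is not the framework's API, which the abstract does not show: all function names here are hypothetical, the demand values are synthetic, and the code deliberately uses only the Python standard library instead of scikit-learn so that it runs without dependencies. Selection by Pearson correlation is one simple filter chosen for illustration and is an assumption, not necessarily the method used in the paper.

```python
import math
from itertools import combinations_with_replacement


def create_cyclic_features(hours):
    """Feature creation: encode hour of day (0-23) as a sine/cosine pair."""
    return [
        (math.sin(2 * math.pi * h / 24), math.cos(2 * math.pi * h / 24))
        for h in hours
    ]


def expand_polynomial(rows, degree=2):
    """Feature expansion: all monomials of the input features up to `degree`."""
    expanded = []
    for row in rows:
        new_row = []
        for d in range(1, degree + 1):
            for combo in combinations_with_replacement(range(len(row)), d):
                new_row.append(math.prod(row[i] for i in combo))
        expanded.append(new_row)
    return expanded


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def select_features(rows, target, threshold=0.5):
    """Feature selection: indices of columns with |correlation| >= threshold."""
    columns = list(zip(*rows))
    return [
        j for j, col in enumerate(columns)
        if abs(pearson(col, target)) >= threshold
    ]


# Synthetic data (assumption): two days of hourly demand peaking at noon.
hours = [h % 24 for h in range(48)]
demand = [10 - 4 * math.cos(2 * math.pi * h / 24) for h in hours]

created = create_cyclic_features(hours)        # columns: sin, cos
expanded = expand_polynomial(created)          # sin, cos, sin^2, sin*cos, cos^2
selected = select_features(expanded, demand)   # indices of retained columns

raw_corr = pearson(hours, demand)              # raw hour vs. demand
cos_corr = pearson([row[1] for row in expanded], demand)
```

On this toy series, the raw hour value is only weakly linearly related to demand, while the created cosine feature tracks it closely, which is the kind of effect that can help a linear model. On real measurements the retained features and any accuracy gain would depend on the data and on the chosen threshold.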

research
01/22/2019

The autofeat Python Library for Automatic Feature Engineering and Selection

This paper describes the autofeat Python library, which provides a sciki...
research
05/09/2022

On Designing Data Models for Energy Feature Stores

The digitization of the energy infrastructure enables new, data driven, ...
research
07/25/2023

Integrating processed-based models and machine learning for crop yield prediction

Crop yield prediction typically involves the utilization of either theor...
research
07/10/2023

Code Generation for Machine Learning using Model-Driven Engineering and SysML

Data-driven engineering refers to systematic data collection and process...
research
03/01/2021

Knowledge-Guided Dynamic Systems Modeling: A Case Study on Modeling River Water Quality

Modeling real-world phenomena is a focus of many science and engineering...
research
08/10/2019

Robust data-driven approach for predicting the configurational energy of high entropy alloys

High entropy alloys (HEAs) have been increasingly attractive as promisin...
research
10/26/2021

Concepts for Automated Machine Learning in Smart Grid Applications

Undoubtedly, the increase of available data and competitive machine lear...
