The autofeat Python Library for Automatic Feature Engineering and Selection

01/22/2019
by Franziska Horn, et al.

This paper describes the autofeat Python library, which provides a scikit-learn style linear regression model with automatic feature engineering and selection capabilities. Complex non-linear machine learning models such as neural networks are often difficult to train in practice and even harder to explain to non-statisticians, who require transparent analysis results as a basis for important business decisions. Linear models are efficient and intuitive, but they generally yield lower prediction accuracy. Our library provides a multi-step feature engineering and selection process: first, a large pool of non-linear features is generated; then, a small and robust set of meaningful features is selected from this pool. The selected features improve the prediction accuracy of a linear model while retaining its interpretability.
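The generate-then-select idea in the abstract can be illustrated with a minimal sketch. This is not the autofeat API and not the authors' selection procedure, which the abstract does not specify. It is a NumPy-only stand-in that assumes a small fixed set of transformations (powers, log, square root, pairwise products) and uses greedy forward selection on R^2 in place of the library's method. The function names `generate_features`, `select_features`, and `fit_linear`, the `min_gain` threshold, and the synthetic data are all hypothetical.

```python
import itertools
import numpy as np


def fit_linear(pool, names, y):
    """Ordinary least squares on the named features plus an intercept."""
    A = np.column_stack([pool[n] for n in names] + [np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2


def generate_features(X, names):
    """Step 1: expand the raw columns into a pool of non-linear candidates."""
    pool = {}
    for j, name in enumerate(names):
        col = X[:, j]
        pool[name] = col
        pool[f"{name}^2"] = col ** 2
        pool[f"{name}^3"] = col ** 3
        if np.all(col > 0):  # log and sqrt only for strictly positive columns
            pool[f"log({name})"] = np.log(col)
            pool[f"sqrt({name})"] = np.sqrt(col)
    for (i, a), (j, b) in itertools.combinations(enumerate(names), 2):
        pool[f"{a}*{b}"] = X[:, i] * X[:, j]
    return pool


def select_features(pool, y, max_features=3, min_gain=1e-3):
    """Step 2 (assumed stand-in): greedy forward selection on in-sample R^2."""
    selected, best_r2 = [], 0.0
    while len(selected) < max_features:
        scores = {
            name: fit_linear(pool, selected + [name], y)[1]
            for name in pool
            if name not in selected
        }
        if not scores:
            break
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break  # no remaining candidate adds a meaningful improvement
        selected.append(name)
        best_r2 = scores[name]
    return selected


# Synthetic data: the target depends on x1^2 and log(x2), not on the raw inputs.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 5.0, size=(200, 2))
y = 2.0 * X[:, 0] ** 2 + 5.0 * np.log(X[:, 1]) + rng.normal(0.0, 0.1, size=200)

pool = generate_features(X, ["x1", "x2"])
selected = select_features(pool, y)
coef, r2 = fit_linear(pool, selected, y)
print(selected, np.round(coef, 2), round(r2, 4))
```

The final model remains a linear combination of a few named features, so each coefficient can be read directly, which is the interpretability argument the abstract makes. A practical implementation would select features on held-out data or with regularization instead of in-sample R^2.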


