SAFE: Scalable Automatic Feature Engineering Framework for Industrial Tasks

by Qitao Shi, et al.
Ant Financial

Machine learning techniques are widely applied to a variety of tasks at Internet companies, where they act as an essential driving force, and feature engineering is generally recognized as a crucial step in building machine learning systems. Recently, growing effort has gone into developing automatic feature engineering methods that relieve practitioners of substantial and tedious manual work. For industrial tasks, however, the efficiency and scalability of these methods are still far from satisfactory. In this paper, we propose a staged method named SAFE (Scalable Automatic Feature Engineering), which provides excellent efficiency and scalability along with the requisite interpretability and promising performance. Extensive experiments show that the proposed method is markedly more efficient than other methods while remaining competitive in effectiveness. Moreover, its scalability allows it to be deployed in large-scale industrial tasks.




AEFE: Automatic Embedded Feature Engineering for Categorical Features

The challenge of solving data mining problems in e-commerce applications...

On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models

For industrial-scale advertising systems, prediction of ad click-through...

AutoSmart: An Efficient and Automatic Machine Learning framework for Temporal Relational Data

Temporal relational data, perhaps the most commonly used data type in in...

Time is of the Essence: Machine Learning-based Intrusion Detection in Industrial Time Series Data

The Industrial Internet of Things drastically increases connectivity of ...

SmartChoices: Augmenting Software with Learned Implementations

We are living in a golden age of machine learning. Powerful models are b...

A Composable Framework for Policy Design, Learning, and Transfer Toward Safe and Efficient Industrial Insertion

Delicate industrial insertion tasks (e.g., PC board assembly) remain cha...

Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction

CTR prediction in real-world business is a difficult machine learning pr...
