Feature Selection Methods for Uplift Modeling

by Zhenyu Zhao, et al.

Uplift modeling is a predictive modeling technique that uses machine learning models to estimate the user-level incremental effect of a treatment. It is often used for targeting promotions and advertisements, as well as for personalizing product offerings. In these applications, hundreds of features are often available to build such models, and keeping all of them in a model can be costly and inefficient. Feature selection is an essential step in the modeling process for multiple reasons: it improves estimation accuracy by eliminating irrelevant features, accelerates model training and prediction, reduces the monitoring and maintenance workload for feature data pipelines, and provides better model interpretation and diagnostics. However, feature selection methods for uplift modeling have rarely been discussed in the literature. Although various feature selection methods exist for standard machine learning models, we demonstrate that those methods are sub-optimal for the feature selection problem in uplift modeling. To address this problem, we introduce a set of feature selection methods designed specifically for uplift modeling, including both filter methods and embedded methods. To evaluate the effectiveness of the proposed methods, we use different uplift models and measure the accuracy of each model with different numbers of selected features. We conduct these experiments on both synthetic and real data. We also implemented the proposed filter methods in an open source Python package (CausalML).
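To make the idea of an uplift-specific filter method concrete, the following is a minimal sketch under stated assumptions, not the authors' method and not the CausalML API. It scores a feature by how much a simple per-bin uplift estimate (treated mean outcome minus control mean outcome) varies across quantile bins of that feature. The function names `uplift_filter_score` and `rank_features`, the binning scheme, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def uplift_filter_score(x, treatment, outcome, n_bins=5):
    """Hypothetical filter score: weighted spread of per-bin uplift around overall uplift."""
    x = np.asarray(x, dtype=float)
    treatment = np.asarray(treatment).astype(bool)
    outcome = np.asarray(outcome, dtype=float)

    # Interior quantile edges split the feature into n_bins roughly equal groups.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)

    # Overall uplift: difference in mean outcome between treated and control users.
    overall = outcome[treatment].mean() - outcome[~treatment].mean()

    score = 0.0
    for b in np.unique(bins):
        in_bin = bins == b
        treated = in_bin & treatment
        control = in_bin & ~treatment
        if treated.sum() == 0 or control.sum() == 0:
            continue  # uplift cannot be estimated in this bin
        bin_uplift = outcome[treated].mean() - outcome[control].mean()
        # Weight each bin by its share of the sample.
        score += in_bin.mean() * (bin_uplift - overall) ** 2
    return score

def rank_features(X, treatment, outcome, n_bins=5):
    """Return feature indices sorted by descending score, plus the scores."""
    scores = np.array([
        uplift_filter_score(X[:, j], treatment, outcome, n_bins)
        for j in range(X.shape[1])
    ])
    return np.argsort(-scores), scores

# Synthetic data (an assumption for illustration only):
# feature 0 drives the treatment effect, feature 1 only shifts the baseline
# outcome, and feature 2 is pure noise.
rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 3))
treatment = rng.integers(0, 2, size=n)
p = 0.2 + 0.1 * (X[:, 1] > 0) + 0.2 * treatment * (X[:, 0] > 0)
outcome = rng.binomial(1, p)

order, scores = rank_features(X, treatment, outcome)
print("ranking:", order)
print("scores:", scores)
```

In this setup, feature 1 is predictive of the outcome but carries no information about the treatment effect, so a selection method built for standard outcome prediction would tend to favor it, whereas a score built on uplift heterogeneity should favor feature 0.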




