Boosting Sensitivity of Large-scale Online Experimentation via Dropout Buyer Imputation

09/09/2022
by   Sumin Shen, et al.

Metrics provide strong evidence to support hypotheses in online experimentation and hence reduce debates in the decision-making process. In this work, we introduce the concept of dropout buyers and categorize users with incomplete metric values into two groups: visitors and dropout buyers. For the analysis of incomplete metrics, we propose a cluster-based k-nearest-neighbors imputation method. The proposed method considers both experiment-specific features and users' activities along their shopping paths, so different users can receive different imputed values. To make imputation efficient on the large-scale data sets typical of online experimentation, the method combines stratification with clustering. We compare its performance with that of several conventional methods on a past experiment at eBay.
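To make the combination of stratification, clustering, and k-nearest-neighbors imputation concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the features, the clustering algorithm, the distance, or how neighbors are aggregated, so the tiny k-means, the Euclidean distance, the mean of neighbor values, and the record keys (`id`, `stratum`, `features`, `metric`) are all illustrative assumptions. The sketch treats any user whose metric is `None` as a candidate for imputation and searches for neighbors only inside that user's own stratum and cluster, which is what keeps the neighbor search small.

```python
from collections import defaultdict
from math import dist
from statistics import mean


def kmeans(points, n_clusters, n_iter=10):
    """Tiny Lloyd's k-means; centroids start at evenly spaced input points."""
    step = max(1, len(points) // n_clusters)
    centroids = [points[i] for i in range(0, len(points), step)][:n_clusters]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its closest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels


def impute(users, n_clusters=2, k=2):
    """Return {user_id: metric}, filling None metrics from nearby users.

    Each user is a dict with hypothetical keys (assumptions, not from the
    paper): id, stratum, features (tuple of floats), metric (float or None).
    """
    # Step 1: stratify, so clustering runs on small, homogeneous groups.
    strata = defaultdict(list)
    for u in users:
        strata[u["stratum"]].append(u)

    result = {}
    for members in strata.values():
        # Step 2: cluster users within the stratum by their features.
        labels = kmeans(
            [u["features"] for u in members], min(n_clusters, len(members))
        )
        clusters = defaultdict(list)
        for u, lab in zip(members, labels):
            clusters[lab].append(u)

        # Step 3: k-NN imputation restricted to the user's own cluster.
        for group in clusters.values():
            donors = [u for u in group if u["metric"] is not None]
            for u in group:
                if u["metric"] is not None:
                    result[u["id"]] = u["metric"]
                elif donors:
                    nearest = sorted(
                        donors, key=lambda d: dist(u["features"], d["features"])
                    )[:k]
                    result[u["id"]] = mean(d["metric"] for d in nearest)
                else:
                    result[u["id"]] = None  # no donors here; left missing
    return result
```

Because each missing value is filled from neighbors in the same stratum and cluster, two users with different features can receive different imputed values, in contrast to filling every missing metric with a single constant.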


research
08/28/2022

Leachable Component Clustering

Clustering attempts to partition data instances into several distinctive...
research
05/13/2020

Multiple Imputation for Biomedical Data using Monte Carlo Dropout Autoencoders

Due to complex experimental settings, missing values are common in biome...
research
07/06/2020

Does imputation matter? Benchmark for predictive models

Incomplete data are common in practical applications. Most predictive ma...
research
06/29/2023

Numerical Data Imputation for Multimodal Data Sets: A Probabilistic Nearest-Neighbor Kernel Density Approach

Numerical data imputation algorithms replace missing values by estimates...
research
06/09/2021

EMFlow: Data Imputation in Latent Space via EM and Deep Flow Models

High dimensional incomplete data can be found in a wide range of systems...
research
06/28/2022

No imputation without representation

By filling in missing values in datasets, imputation allows these datase...
research
04/07/2020

Learning Individual Models for Imputation (Technical Report)

Missing numerical values are prevalent, e.g., owing to unreliable sensor...
