A Nonparametric Contextual Bandit with Arm-level Eligibility Control for Customer Service Routing

09/08/2022
by Ruofeng Wen, et al.

Amazon Customer Service provides real-time support for millions of customer contacts every year. While the bot-resolver helps automate some traffic, we still see high demand for human agents, also called subject matter experts (SMEs). Customers reach out with questions in different domains (return policy, device troubleshooting, etc.). Depending on their training, not all SMEs are eligible to handle all contacts. Routing contacts to eligible SMEs turns out to be a non-trivial problem because SMEs' domain eligibility is subject to training quality and can change over time. To optimally recommend SMEs while simultaneously learning the true eligibility status, we propose to formulate the routing problem with a nonparametric contextual bandit algorithm (K-Boot) plus an eligibility control (EC) algorithm. K-Boot models reward with a kernel smoother on similar past samples selected by k-NN, and uses Bootstrap Thompson Sampling for exploration. EC filters arms (SMEs) by the initially system-claimed eligibility and dynamically validates the reliability of this information. The proposed K-Boot is a general bandit algorithm, and EC is applicable to other bandits. Our simulation studies show that K-Boot performs on par with state-of-the-art bandit models, and that EC boosts K-Boot performance when a stochastic eligibility signal exists.
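As a rough illustration of the ideas named in the abstract, the sketch below scores each eligible arm with a Gaussian-kernel-weighted average over a bootstrap resample of its k nearest past contexts, then picks the highest-scoring arm. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`bootstrap_kernel_score`, `select_arm`), the Gaussian kernel, the Euclidean distance, the rule that untried arms are explored first, and the default values of `k` and `bandwidth` are all hypothetical. The eligibility step here is only a static filter on a claimed-eligibility set; the dynamic validation of that signal described for EC is not shown.

```python
import math
import random


def bootstrap_kernel_score(samples, context, k=5, bandwidth=1.0, rng=random):
    """Kernel-smoothed reward estimate from a bootstrap resample of the
    k nearest past samples. `samples` is a list of (context, reward) pairs."""
    # k-NN step: keep the k past samples closest to the current context.
    nearest = sorted(samples, key=lambda s: math.dist(s[0], context))[:k]
    # Bootstrap step: resample the neighbours with replacement so that
    # repeated calls yield randomized estimates (Thompson-style exploration).
    resampled = [rng.choice(nearest) for _ in nearest]
    # Kernel smoother: Gaussian weights by distance (an assumed kernel).
    weights = [
        math.exp(-((math.dist(x, context) / bandwidth) ** 2))
        for x, _ in resampled
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to a plain average.
        return sum(r for _, r in resampled) / len(resampled)
    return sum(w * r for w, (_, r) in zip(weights, resampled)) / total


def select_arm(history, eligible, context, k=5, bandwidth=1.0, rng=random):
    """Pick the eligible arm with the highest randomized score.
    `history` maps arm -> list of (context, reward) pairs."""
    best_arm, best_score = None, -math.inf
    for arm in sorted(eligible):  # static eligibility filter (assumption)
        samples = history.get(arm, [])
        # Assumption: arms with no data get an infinite score so they are tried.
        score = math.inf if not samples else bootstrap_kernel_score(
            samples, context, k=k, bandwidth=bandwidth, rng=rng
        )
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm
```

In a routing setting, `history` would hold past (contact features, outcome) pairs per SME and `eligible` the SMEs currently claimed eligible for the contact's domain. Because the score is recomputed from a fresh resample on every call, arms with few or noisy observations receive more variable scores and are therefore explored more often.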


