Weakly Supervised Learning Meets Ride-Sharing User Experience Enhancement

01/19/2020
by Lan-Zhe Guo, et al.

Weakly supervised learning aims to cope with scarce labeled data. Previous weakly supervised studies typically assume that the data contains only one kind of weak supervision. In many applications, however, raw data contains more than one kind of weak supervision at the same time. For example, in user experience enhancement at Didi, one of the largest online ride-sharing platforms, the ride comment data contains both severe label noise (due to the subjective judgments of passengers) and severe label distribution bias (due to sampling bias). We call this problem "compound weakly supervised learning". In this paper, we propose the CWSL method to address this problem based on Didi ride-sharing comment data. Specifically, an instance reweighting strategy is employed to cope with the severe label noise in comment data, where harmful noisy instances receive small weights. To correct for the biased label distribution, robust criteria such as AUC, rather than accuracy, are optimized together with validation performance. Alternating optimization and stochastic gradient methods accelerate the optimization on large-scale data. Experiments on Didi ride-sharing comment data clearly validate the effectiveness of the method. We hope this work sheds some light on applying weakly supervised learning to complex real-world situations.
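The abstract does not spell out the CWSL objective or its weight update, so the following is only a minimal sketch of the general idea, not the authors' method. It alternates between fitting an instance-weighted logistic model and down-weighting high-loss (likely noisy) instances, and it reports AUC rather than accuracy. The function names (`auc`, `weighted_logistic_fit`, `alternate`), the loss-quantile weighting rule, and the parameters `keep` and `low` are all assumptions made for illustration; full-batch gradient descent stands in for the stochastic gradient methods mentioned above.

```python
import numpy as np

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def weighted_logistic_fit(X, y, w, lr=0.1, epochs=200):
    """Full-batch gradient descent on an instance-weighted logistic loss."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (w * (p - y)) / w.sum()
        theta -= lr * grad
    return theta

def alternate(X, y, rounds=3, keep=0.8, low=0.1):
    """Alternate between (1) fitting the model with weights fixed and
    (2) updating weights with the model fixed.

    The weight rule here is a hypothetical stand-in: instances whose loss
    exceeds the `keep` quantile get the small weight `low`.
    """
    w = np.ones(len(y))
    theta = np.zeros(X.shape[1])
    for _ in range(rounds):
        theta = weighted_logistic_fit(X, y, w)
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        cutoff = np.quantile(loss, keep)
        w = np.where(loss <= cutoff, 1.0, low)
    return theta, w

# Toy data: one feature, with the last instance carrying a flipped label.
X = np.array([[1.0], [2.0], [3.0], [-1.0], [-2.0], [-3.0], [3.0]])
y_noisy = np.array([1, 1, 1, 0, 0, 0, 0])
y_clean = np.array([1, 1, 1, 0, 0, 0, 1])

theta, w = alternate(X, y_noisy)
print("weights:", w)
print("AUC on clean labels:", auc(X @ theta, y_clean))
```

In this toy setting the mislabeled instance incurs the largest loss and is therefore among those down-weighted. A threshold-free ranking measure like AUC is used for evaluation because accuracy can look high on a skewed label distribution simply by predicting the majority class.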


