QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction

by Danqing Zhang, et al.

We study the problem of query attribute value extraction, which aims to identify named entities in user queries as attribute values in diverse surface forms and then transform them into canonical forms. The problem consists of two phases: named entity recognition (NER) and attribute value normalization (AVN). However, existing work focuses only on the NER phase and neglects the equally important AVN phase. To bridge this gap, this paper proposes QUEACO, a unified query attribute value extraction system for e-commerce search that covers both phases. Moreover, by leveraging large-scale weakly-labeled behavior data, we further improve extraction performance at lower supervision cost. Specifically, for the NER phase, QUEACO adopts a novel teacher-student network, in which a teacher network trained on the strongly-labeled data generates pseudo-labels that refine the weakly-labeled data used to train a student network. Meanwhile, the teacher network is dynamically adapted using feedback from the student's performance on strongly-labeled data, so as to denoise the supervision from the weak labels as much as possible. For the AVN phase, we also leverage the weakly-labeled query-to-attribute behavior data to normalize surface-form attribute values from queries into canonical forms from products. Extensive experiments on a real-world large-scale e-commerce dataset demonstrate the effectiveness of QUEACO.
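
To make the AVN phase concrete, here is a minimal sketch of how weakly-labeled behavior data could map a surface form to a canonical value. It is not the method used in QUEACO, which the abstract does not detail. It assumes a simple frequency-based lookup, and the names `build_normalizer`, `normalize`, and `min_count`, as well as the toy (surface form, canonical value) pairs, are hypothetical.

```python
from collections import Counter, defaultdict


def build_normalizer(behavior_pairs, min_count=2):
    """Build a surface-form -> canonical-value table from weak behavior pairs.

    behavior_pairs: iterable of (surface_form, canonical_value) tuples, e.g.
    a query span paired with the attribute value of a product the user
    engaged with. This input format is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for surface, canonical in behavior_pairs:
        counts[surface.lower().strip()][canonical] += 1

    table = {}
    for surface, counter in counts.items():
        canonical, n = counter.most_common(1)[0]
        # Keep only mappings with enough supporting evidence; weak labels are noisy.
        if n >= min_count:
            table[surface] = canonical
    return table


def normalize(surface, table):
    """Return the canonical value for a surface form, or None if unknown."""
    return table.get(surface.lower().strip())


# Toy, made-up behavior pairs (hypothetical data).
pairs = [
    ("nyke", "Nike"), ("nyke", "Nike"), ("nyke", "New Balance"),
    ("mk", "Michael Kors"), ("mk", "Michael Kors"),
    ("blk", "Black"),
]
table = build_normalizer(pairs)
print(normalize("Nyke", table))
print(normalize("blk", table))
```

The `min_count` threshold is a crude stand-in for denoising: a surface form seen only once in the behavior data is left unnormalized rather than mapped on thin evidence.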


Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data

Weak supervision has shown promising results in many natural language pr...

SwellShark: A Generative Model for Biomedical Named Entity Recognition without Labeled Data

We present SwellShark, a framework for building biomedical named entity ...

Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble

Distantly-Supervised Named Entity Recognition (DS-NER) effectively allev...

CL-NERIL: A Cross-Lingual Model for NER in Indian Languages

Developing Named Entity Recognition (NER) systems for Indian languages h...

Noisy-Labeled NER with Confidence Estimation

Recent studies in deep learning have shown significant progress in named...

Gradient Imitation Reinforcement Learning for General Low-Resource Information Extraction

Information Extraction (IE) aims to extract structured information from ...

Knowledge-Enhanced Multi-Label Few-Shot Product Attribute-Value Extraction

Existing attribute-value extraction (AVE) models require large quantitie...
