Study of Encoder-Decoder Architectures for Code-Mix Search Query Translation

08/07/2022
by Mandar Kulkarni, et al.

With the broad reach of the internet and smartphones, e-commerce platforms have an increasingly diverse user base. Because many native-language users are not conversant in English, they prefer to browse in their regional language or in a mix of their regional language and English. In a recent study of our query data, we found that many of the queries we receive are code-mixed, specifically Hinglish, i.e., queries with one or more Hindi words written in English (Latin) script. We propose a transformer-based approach to code-mix query translation that enables users to search with these queries. We demonstrate the effectiveness, for this task, of pre-trained encoder-decoder models trained on a large corpus of unlabeled English text. Using generic-domain translation models, we created a pseudo-labelled dataset for training the model on search queries and verified the effectiveness of various data augmentation techniques. Further, to reduce the latency of the model, we use knowledge distillation and weight quantization. The effectiveness of the proposed method has been validated through experimental evaluations and A/B testing. The model is currently live on the Flipkart app and website, serving millions of queries.
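The abstract names knowledge distillation and weight quantization as the latency reductions but does not give their details. The sketch below is a generic illustration only, not the authors' implementation: it assumes a temperature-softened KL divergence as the distillation objective and symmetric per-tensor int8 rounding for quantization. All function names and the temperature value are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumed objective for illustration; the abstract does not specify one.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: returns (int values, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate float weights."""
    return [v * scale for v in q]

if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]  # made-up logits for one decoding step
    student = [1.5, 0.7, -0.8]
    print("distillation loss:", distillation_loss(teacher, student))

    weights = [0.5, -1.0, 0.25]  # made-up weight values
    q, scale = quantize_int8(weights)
    print("quantized:", q, "scale:", scale)
    print("restored:", dequantize(q, scale))
```

In this sketch the student is penalised for diverging from the teacher's softened output distribution, and each quantized weight is restored to within half a quantization step of its original value.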

research
08/07/2022

Vernacular Search Query Translation with Unsupervised Domain Adaptation

With the democratization of e-commerce platforms, an increasingly divers...
research
06/27/2023

Constructing Multilingual Code Search Dataset Using Neural Machine Translation

Code search is a task to find programming codes that semantically match ...
research
07/15/2023

Intuitive Access to Smartphone Settings Using Relevance Model Trained by Contrastive Learning

The more new features that are being added to smartphones, the harder it...
research
03/01/2021

Query Rewriting via Cycle-Consistent Translation for E-Commerce Search

Nowadays e-commerce search has become an integral part of many people's ...
research
05/28/2018

Graph-based Filtering of Out-of-Vocabulary Words for Encoder-Decoder Models

Encoder-decoder models typically only employ words that are frequently u...
research
07/02/2019

Learning to Reformulate the Queries on the WEB

Inability of the naive users to formulate appropriate queries is a funda...
research
10/06/2020

Incorporating Behavioral Hypotheses for Query Generation

Generative neural networks have been shown effective on query suggestion...
