Improving Content Retrievability in Search with Controllable Query Generation

03/21/2023
by Gustavo Penha, et al.

An important goal of online platforms is to enable content discovery, i.e. to allow users to find catalog entities they were not familiar with. A prerequisite for discovering an entity, e.g. a book, with a search engine is that the entity is retrievable, i.e. there are queries for which the system surfaces the entity in the top results. However, machine-learned search engines have a high retrievability bias, where the majority of queries return the same entities. This happens partly due to the predominance of narrow-intent queries, where users create queries using the title of an already known entity, e.g. 'harry potter' in book search. The number of broad-intent queries, where users want to discover new entities, e.g. 'chill lyrical electronica with an atmospheric feeling to it' in music search, and have a higher tolerance for what they might find, is small in comparison. We focus here on two factors that have a negative impact on the retrievability of entities: (I) the training data used for dense retrieval models and (II) the distribution of narrow and broad intent queries issued in the system. We propose CtrlQGen, a method that generates queries for a chosen underlying intent, narrow or broad. We can use CtrlQGen to address factor (I) by generating training data for dense retrieval models composed of diverse synthetic queries. CtrlQGen can also be used to address factor (II) by suggesting queries with broader intents to users. Our results on datasets from the domains of music, podcasts, and books reveal that we can significantly decrease the retrievability bias of a dense retrieval model when using CtrlQGen. First, by using the generated queries as training data for dense models we make 9 non-zero retrievability). Second, by suggesting broader queries to users, we can make 12
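To make the notion of retrievability above concrete, here is a minimal Python sketch. It assumes retrievability is measured as the number of queries for which an entity appears in the top-k results, and it summarizes bias with a Gini coefficient over those counts. The Gini summary is a common choice in retrievability studies but is an assumption here, not something the abstract specifies. The catalog, queries, rankings, and function names are all hypothetical.

```python
from collections import Counter


def retrievability(rankings, k):
    """Count, per entity, how many queries surface it in the top-k results."""
    counts = Counter()
    for ranked_entities in rankings.values():
        # Use a set so an entity is counted at most once per query.
        counts.update(set(ranked_entities[:k]))
    return counts


def nonzero_share(counts, catalog):
    """Fraction of catalog entities retrieved by at least one query."""
    return sum(1 for entity in catalog if counts[entity] > 0) / len(catalog)


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical toy data: two narrow-intent queries and one broader query.
catalog = ["book_a", "book_b", "book_c", "book_d"]
rankings = {
    "harry potter": ["book_a", "book_b"],
    "harry potter books": ["book_a", "book_b"],
    "wizard school adventure for kids": ["book_a", "book_c"],
}

counts = retrievability(rankings, k=2)
scores = [counts[entity] for entity in catalog]
share = nonzero_share(counts, catalog)
bias = gini(scores)
print(scores, share, round(bias, 3))
```

In this toy setup `book_d` is never surfaced, so it has zero retrievability. Adding queries that reach it would raise the non-zero share and lower the Gini value, which is the kind of change the abstract describes.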

