End-to-End Learning to Index and Search in Large Output Spaces

10/16/2022
by   Nilesh Gupta, et al.

Extreme multi-label classification (XMC) is a popular framework for solving many real-world problems that require accurate prediction from a very large number of potential output choices. A popular approach for dealing with the large label space is to arrange the labels into a shallow tree-based index and then learn an ML model to efficiently search this index via beam search. Existing methods initialize the tree index by clustering the label space into a few mutually exclusive clusters based on pre-defined features and keep it fixed throughout the training procedure. This approach results in a sub-optimal indexing structure over the label space and limits the search performance to the quality of choices made during the initialization of the index. In this paper, we propose a novel method ELIAS which relaxes the tree-based index to a specialized weighted graph-based index which is learned end-to-end with the final task objective. More specifically, ELIAS models the discrete cluster-to-label assignments in the existing tree-based index as soft learnable parameters that are learned jointly with the rest of the ML model. ELIAS achieves state-of-the-art performance on several large-scale extreme classification benchmarks with millions of labels. In particular, ELIAS can be up to 2.5% better at precision@1 and up to 4% better at recall@100 than existing XMC methods. A PyTorch implementation of ELIAS along with other resources is available at https://github.com/nilesh2797/ELIAS.
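
To make the idea of a weighted graph-based index concrete, the following is a minimal, illustrative sketch of the search step only. It is not the authors' implementation. The function `search`, the toy vectors, and the hand-picked weights in `assign` are all assumptions made for this example. It uses plain Python so that it runs without dependencies; learning the weights jointly with the model, as the abstract describes, would require an autodiff framework such as PyTorch.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def search(query, cluster_vecs, assign, label_vecs, beam=2, top_k=3):
    """Hypothetical two-level search over a soft cluster-to-label index.

    assign[c] maps label id -> soft weight for cluster c. A label may appear
    under several clusters, which a hard tree partition would not allow.
    """
    # Step 1: score every cluster for the query.
    cluster_probs = softmax([dot(query, c) for c in cluster_vecs])
    # Step 2: beam search keeps only the `beam` best clusters.
    shortlisted = sorted(range(len(cluster_probs)),
                         key=lambda c: cluster_probs[c], reverse=True)[:beam]
    # Step 3: a label's path score sums over every shortlisted cluster
    # that links to it, weighted by the soft assignment.
    path = {}
    for c in shortlisted:
        for label, weight in assign[c].items():
            path[label] = path.get(label, 0.0) + cluster_probs[c] * weight
    # Step 4: combine the path score with a per-label relevance score.
    scored = {l: p * sigmoid(dot(query, label_vecs[l])) for l, p in path.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Toy data (assumed): 3 clusters, 4 labels, 2-dimensional embeddings.
query = [1.0, 0.0]
cluster_vecs = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
assign = [
    {0: 0.9, 1: 0.4},  # cluster 0
    {2: 0.8, 3: 0.7},  # cluster 1
    {1: 0.6, 3: 0.2},  # cluster 2 (label 1 is shared with cluster 0)
]
label_vecs = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [-1.0, 0.0]]

results = search(query, cluster_vecs, assign, label_vecs)
for label, score in results:
    print(label, round(score, 4))
```

In this toy run, label 1 is reachable through two shortlisted clusters and accumulates score from both, while label 2 is never scored because its only cluster falls outside the beam. With a fixed, mutually exclusive partition, each label would sit under exactly one cluster, so a poor initial clustering could not be corrected during training.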

research
06/24/2021

Label Disentanglement in Partition-based Extreme Multilabel Classification

Partition-based methods are increasingly-used in extreme multi-label cla...
research
04/17/2019

Bonsai - Diverse and Shallow Trees for Extreme Multi-label Classification

Extreme multi-label classification refers to supervised multi-label lear...
research
06/26/2023

AirIndex: Versatile Index Tuning Through Data and Storage

The end-to-end lookup latency of a hierarchical index – such as a B-tree...
research
10/12/2020

PECOS: Prediction for Enormous and Correlated Output Spaces

Many challenging problems in modern applications amount to finding relev...
research
03/04/2023

Learning Label Encodings for Deep Regression

Deep regression networks are widely used to tackle the problem of predic...
research
07/12/2020

Deep Retrieval: An End-to-End Learnable Structure Model for Large-Scale Recommendations

One of the core problems in large-scale recommendations is to retrieve t...
research
08/07/2022

Automatically Finding Optimal Index Structure

Existing learned indexes (e.g., RMI, ALEX, PGM) optimize the internal re...
