Rethinking Dense Retrieval's Few-Shot Ability

04/12/2023
by Si Sun, et al.

Few-shot dense retrieval (DR) aims to generalize effectively to novel search scenarios by learning from a few samples. Despite its importance, specialized datasets and standardized evaluation protocols have received little study. As a result, current methods often resort to random sampling from supervised datasets to create "few-data" setups and employ inconsistent training strategies during evaluation, which makes it difficult to compare recent progress accurately. In this paper, we propose a customized FewDR dataset and a unified evaluation benchmark. Specifically, FewDR employs class-wise sampling to establish a standardized "few-shot" setting with finely defined classes, reducing variability across multiple sampling rounds. Moreover, the dataset is split into disjoint base and novel classes, allowing DR models to be continuously trained on ample data from base classes and a few samples from novel classes. This benchmark eliminates the risk of novel class leakage, providing a reliable estimate of a DR model's few-shot ability. Our extensive empirical results reveal that current state-of-the-art DR models still face challenges in the standard few-shot setting. Our code and data will be open-sourced at https://github.com/OpenMatch/ANCE-Tele.
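
The class-wise sampling and base/novel split described in the abstract can be illustrated with a minimal Python sketch. This is not the FewDR construction code: the function names, the `(query, class_label)` data layout, the `novel_fraction` parameter, and the default of five shots per class are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_classes(class_names, novel_fraction=0.2, seed=0):
    """Partition class labels into disjoint base and novel sets (hypothetical helper)."""
    rng = random.Random(seed)
    names = sorted(class_names)  # sort first so the shuffle is reproducible
    rng.shuffle(names)
    n_novel = max(1, int(len(names) * novel_fraction))
    return set(names[n_novel:]), set(names[:n_novel])  # (base, novel)


def sample_k_shot(examples, novel_classes, k, seed=0):
    """Class-wise sampling: draw up to k queries from every novel class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for query, cls in examples:
        if cls in novel_classes:
            by_class[cls].append(query)
    return {
        cls: rng.sample(by_class[cls], min(k, len(by_class[cls])))
        for cls in sorted(by_class)
    }


def build_few_shot_split(examples, novel_fraction=0.2, k=5, seed=0):
    """Return (base_train, support, novel_test) from (query, class_label) pairs."""
    classes = {cls for _, cls in examples}
    base, novel = split_classes(classes, novel_fraction, seed)
    # Base training data contains no novel-class queries, so novel classes cannot leak.
    base_train = [(q, c) for q, c in examples if c in base]
    support = sample_k_shot(examples, novel, k, seed)
    support_queries = {q for qs in support.values() for q in qs}
    # Remaining novel-class queries are held out for evaluation.
    novel_test = [
        (q, c) for q, c in examples if c in novel and q not in support_queries
    ]
    return base_train, support, novel_test
```

Sampling per class, rather than uniformly over the whole pool, guarantees that every novel class contributes the same number of shots, which is one way to reduce variance between sampling rounds.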


research
10/27/2022

COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning

We present a new zero-shot dense retrieval (ZeroDR) method, COCO-DR, to ...
research
05/05/2022

Generating Representative Samples for Few-Shot Classification

Few-shot learning (FSL) aims to learn new categories with a few visual s...
research
10/14/2021

Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

Dense retrieval (DR) methods conduct text retrieval by first encoding te...
research
02/25/2022

Asyncval: A Toolkit for Asynchronously Validating Dense Retriever Checkpoints during Training

The process of model checkpoint validation refers to the evaluation of t...
research
05/10/2021

Few-Shot Conversational Dense Retrieval

Dense retrieval (DR) has the potential to resolve the query understandin...
research
02/15/2023

How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval

Various techniques have been developed in recent years to improve dense ...
research
12/22/2020

Progressive One-shot Human Parsing

Prior human parsing models are limited to parsing humans into classes pr...
