Dense Retrieval Adaptation using Target Domain Description

07/06/2023
by Helia Hashemi, et al.

In information retrieval (IR), domain adaptation is the process of adapting a retrieval model to a new domain whose data distribution differs from that of the source domain. Existing methods in this area focus either on unsupervised domain adaptation, where they have access to the target document collection, or on supervised (often few-shot) domain adaptation, where they additionally have access to (limited) labeled data in the target domain. There is also research on improving the zero-shot performance of retrieval models with no adaptation. This paper introduces a new, as yet unexplored category of domain adaptation in IR. Here, as in the zero-shot setting, we assume the retrieval model does not have access to the target document collection. Unlike the zero-shot setting, however, it does have access to a brief textual description that explains the target domain. We define a taxonomy of domain attributes in retrieval tasks to understand the different properties of a source domain that can be adapted to a target domain. We introduce a novel automatic data construction pipeline that, given a textual domain description, produces a synthetic document collection, query set, and pseudo relevance labels. Extensive experiments on five diverse target domains show that adapting dense retrieval models using the constructed synthetic data leads to effective retrieval performance on the target domain.


