MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale

10/02/2020
by Andreas Rücklé, et al.

We study the zero-shot transfer capabilities of text matching models on a massive scale by self-supervised training on 140 source domains from English community question answering forums. We evaluate the models on nine benchmarks of answer selection and question similarity tasks and show that all 140 models transfer surprisingly well, with the large majority substantially outperforming common IR baselines. We also demonstrate that considering a broad selection of source domains is crucial for obtaining the best zero-shot transfer performance, in contrast to the standard procedure of relying only on the largest and most similar domains. In addition, we extensively study how best to combine multiple source domains, and propose combining self-supervised with supervised multi-task learning on all available source domains. Our best zero-shot transfer model considerably outperforms in-domain BERT and the previous state of the art on six benchmarks. Fine-tuning our model with in-domain data yields additional large gains and achieves a new state of the art on all nine benchmarks.
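To make the evaluation setting concrete, the following is a minimal sketch of how an answer selection benchmark might be scored: each question comes with a pool of candidates, a scoring function ranks them, and accuracy and mean reciprocal rank (MRR) are computed over the rankings. This is not the authors' code. The function names, the toy data, and the term-overlap scorer are all hypothetical; the scorer merely stands in for a lexical IR baseline, and a trained text matching model could be passed as `score_fn` instead.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption made for brevity)."""
    return text.lower().split()


def overlap_score(question, candidate):
    """Toy lexical baseline: cosine similarity of term-count vectors."""
    q = Counter(tokenize(question))
    c = Counter(tokenize(candidate))
    dot = sum(q[t] * c[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in c.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(question, candidates, score_fn):
    """Return candidate indices ordered from highest to lowest score."""
    return sorted(
        range(len(candidates)),
        key=lambda i: score_fn(question, candidates[i]),
        reverse=True,
    )


def evaluate(examples, score_fn):
    """Compute (accuracy at rank 1, MRR) over (question, candidates, gold_index) triples."""
    hits, reciprocal_ranks = 0, 0.0
    for question, candidates, gold in examples:
        ranking = rank_candidates(question, candidates, score_fn)
        rank = ranking.index(gold) + 1  # 1-based rank of the correct candidate
        hits += int(rank == 1)
        reciprocal_ranks += 1.0 / rank
    n = len(examples)
    return hits / n, reciprocal_ranks / n


# Hypothetical toy data: the third question has no word overlap with its
# correct answer, so a purely lexical scorer ranks the distractor first.
toy_examples = [
    ("how do I reset my router password",
     ["reset the router password by holding the button",
      "bake the cake for thirty minutes"], 0),
    ("best way to water tomato plants",
     ["install the driver from the vendor site",
      "water tomato plants early in the morning"], 1),
    ("why does my laptop get hot",
     ["the fan may be clogged with dust",
      "my laptop does get great reviews"], 0),
]

accuracy, mrr = evaluate(toy_examples, overlap_score)
print(f"accuracy={accuracy:.3f} mrr={mrr:.3f}")
```

Because `evaluate` only depends on a `score_fn(question, candidate)` callable, the same harness could in principle compare a lexical baseline against a model trained on other domains and applied without in-domain training data, which is the zero-shot comparison the abstract describes.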

