Spontaneous Emerging Preference in Two-tower Language Model

10/13/2022
by Zhengqi He, et al.

The ever-growing size of foundation language models has brought significant performance gains across many types of downstream tasks. However, large foundation models also carry side-effects such as deployment cost, availability issues, and environmental cost, which motivates exploring other possible directions, such as a divide-and-conquer scheme. In this paper, we ask a basic question: are language processes naturally dividable? We study this question in a simple two-tower language model setting, where two language models with identical configurations are trained side-by-side cooperatively. In this setting, we discover a spontaneous emerging preference phenomenon: some tokens are consistently better predicted by one tower, while others are better predicted by the other. The phenomenon is qualitatively stable regardless of model configuration and type, suggesting it is an intrinsic property of natural language. This study suggests that interesting properties of natural language are still waiting to be discovered, which may aid the future development of natural language processing techniques.
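The abstract does not specify how per-token preference is measured, but one natural reading is: compare each tower's per-token prediction loss and assign a token to the tower that predicts it better on average. The following is a minimal sketch under that assumption; the function name, the mean-loss aggregation rule, and the toy numbers are all illustrative, not taken from the paper.

```python
# Hypothetical sketch: given per-token losses from two co-trained
# language models ("towers"), assign each vocabulary token to the
# tower that predicts it better on average. The aggregation rule
# (mean loss per token id) is an assumption for illustration.
from collections import defaultdict

def tower_preference(token_ids, loss_a, loss_b):
    """Return {token_id: 'A' or 'B'}, whichever tower has the
    lower mean loss on that token across the corpus."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # [sum_a, sum_b, count]
    for t, la, lb in zip(token_ids, loss_a, loss_b):
        s = sums[t]
        s[0] += la
        s[1] += lb
        s[2] += 1
    return {t: ("A" if s[0] / s[2] < s[1] / s[2] else "B")
            for t, s in sums.items()}

# Toy data: token 7 is consistently easier for tower A,
# token 3 for tower B.
tokens = [7, 3, 7, 3]
pref = tower_preference(tokens,
                        loss_a=[0.5, 2.0, 0.6, 1.9],
                        loss_b=[1.5, 0.8, 1.4, 0.7])
# pref -> {7: 'A', 3: 'B'}
```

A consistent split like this across training runs and configurations is what the paper calls a spontaneous emerging preference.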

