Generate to Understand for Representation

06/14/2023
by Changshang Xue, et al.

In recent years, a large number of high-quality pretrained models have emerged, greatly advancing Natural Language Understanding (NLU), Natural Language Generation (NLG), and text representation tasks. Traditionally, these models are pretrained on custom domain corpora and fine-tuned for specific downstream tasks, which incurs high GPU and labor costs. Recent trends in language modeling have shifted toward improving performance through scaling, further exacerbating these costs. We propose GUR, a pretraining framework that combines a language modeling objective and a contrastive learning objective in a single training step. We select similar text pairs from raw unlabeled documents based on their Longest Common Substring (LCS) and train the model with masked language modeling and unsupervised contrastive learning. The resulting model, GUR, achieves strong results without any labeled training data, outperforming all other pretrained baselines as a retriever on a recall benchmark in a zero-shot setting. GUR also retains its language modeling ability, as demonstrated in our ablation experiments. Our code is available at <https://github.com/laohur/GUR>.
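The abstract describes two ingredients: mining positive pairs from unlabeled documents by Longest Common Substring (LCS), and jointly optimizing a masked language modeling loss with an unsupervised contrastive loss in one training step. The sketch below illustrates both pieces in PyTorch; the function names, the 0.3 LCS-ratio threshold, and the 0.05 temperature are illustrative assumptions, not the paper's exact implementation (see the linked repository for the authors' code).

```python
# Minimal sketch of (1) LCS-based positive-pair mining and
# (2) a joint MLM + in-batch contrastive (InfoNCE) objective.
# All names and hyperparameters here are illustrative assumptions.

import torch
import torch.nn.functional as F


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common substring of a and b (O(len(a)*len(b)) DP)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def mine_pairs(sentences, min_ratio=0.3):
    """Keep sentence pairs whose LCS covers at least min_ratio of the shorter text."""
    pairs = []
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            a, b = sentences[i], sentences[j]
            if lcs_length(a, b) >= min_ratio * min(len(a), len(b)):
                pairs.append((a, b))
    return pairs


def joint_loss(mlm_logits, mlm_labels, emb_a, emb_b, temperature=0.05):
    """Single-step objective: MLM cross-entropy plus in-batch InfoNCE contrastive loss."""
    mlm = F.cross_entropy(
        mlm_logits.view(-1, mlm_logits.size(-1)),
        mlm_labels.view(-1),
        ignore_index=-100,  # unmasked positions are ignored
    )
    a = F.normalize(emb_a, dim=-1)
    b = F.normalize(emb_b, dim=-1)
    sim = a @ b.t() / temperature                # (batch, batch) similarity matrix
    targets = torch.arange(sim.size(0), device=sim.device)
    contrastive = F.cross_entropy(sim, targets)  # diagonal entries are the positives
    return mlm + contrastive
```

In a setup like this, the two sentence embeddings would presumably come from the same encoder that produces the MLM logits, so a single forward pass can feed both terms of the loss, consistent with the "single training step" described in the abstract.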
