Language Model Pre-Training with Sparse Latent Typing

10/23/2022
by Liliang Ren, et al.

Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus only on text reconstruction and do not seek to learn interpretable latent-level representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. Moreover, a language model pre-trained with this objective significantly improves Information Extraction related downstream tasks in both supervised and few-shot settings. Our code is publicly available at: https://github.com/renll/SparseLT.
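As a rough illustration of what "sparsely extracting keywords with latent types" could mean mechanically, the PyTorch sketch below assigns each token a discrete latent type via a Gumbel-softmax selector, reserving type 0 as a "not a keyword" type, and penalizes the fraction of tokens selected as keywords. This is a minimal sketch under those assumptions, not the authors' released implementation (see the linked repository for that); the names SparseLatentTyper, num_types, and sparsity_weight are all hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseLatentTyper(nn.Module):
    """Hypothetical sparse latent typing head over encoder token states."""

    def __init__(self, hidden_size=768, num_types=16, sparsity_weight=0.1):
        super().__init__()
        # num_types latent keyword types, plus a reserved index 0
        # meaning "this token is not a keyword".
        self.type_logits = nn.Linear(hidden_size, num_types + 1)
        self.sparsity_weight = sparsity_weight

    def forward(self, token_states, attention_mask):
        # token_states: (batch, seq_len, hidden) contextual encoder outputs
        # attention_mask: (batch, seq_len) with 1.0 for real tokens
        logits = self.type_logits(token_states)
        # Differentiable discrete type assignment per token; hard=True gives
        # one-hot samples in the forward pass with a soft gradient.
        types = F.gumbel_softmax(logits, tau=1.0, hard=True)
        # Fraction of real tokens assigned any non-null type: pushing this
        # down encourages the model to type only a sparse set of keywords.
        selected = types[..., 1:].sum(-1) * attention_mask
        sparsity_loss = selected.sum() / attention_mask.sum().clamp(min=1.0)
        return types, self.sparsity_weight * sparsity_loss

# Toy usage with random encoder states (batch=2, seq_len=5). In real
# pre-training this loss term would be added to the usual LM objective.
typer = SparseLatentTyper(hidden_size=768, num_types=16)
states = torch.randn(2, 5, 768)
mask = torch.ones(2, 5)
types, loss = typer(states, mask)
print(types.shape, loss.item())  # torch.Size([2, 5, 17]) and a scalar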


Related research

12/22/2020 · Pre-Training a Language Model Without Human Language
In this paper, we study how the intrinsic nature of pre-training data co...

12/12/2020 · The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models
The computer vision world has been re-gaining enthusiasm in various pre-...

03/12/2022 · ELLE: Efficient Lifelong Pre-training for Emerging Data
Current pre-trained language models (PLM) are typically trained with sta...

10/14/2020 · Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision
Humans learn language by listening, speaking, writing, reading, and also...

03/25/2022 · Reinforcement Learning with Action-Free Pre-Training from Videos
Recent unsupervised pre-training methods have shown to be effective on l...

06/11/2023 · QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search
In light of the success of the pre-trained language models (PLMs), conti...

04/30/2022 · Foundational Models for Continual Learning: An Empirical Study of Latent Replay
Rapid development of large-scale pre-training has resulted in foundation...
