
Learning Universal Representations from Word to Sentence

by Yian Li, et al.

Despite well-developed, cutting-edge representation learning for language, most language representation models focus on a specific level of linguistic unit, which causes great inconvenience when multiple levels of linguistic objects must be handled in a unified way. This work therefore introduces and explores universal representation learning, i.e., embedding different levels of linguistic units in a uniform vector space, through a task-independent evaluation. We present our approach to constructing analogy datasets of words, phrases and sentences, and experiment with multiple representation models to examine geometric properties of the learned vector space. We then empirically verify that well pre-trained Transformer models combined with appropriate training settings may effectively yield universal representations. In particular, our implementation of fine-tuning ALBERT on NLI and PPDB datasets achieves the highest accuracy on analogy tasks at different language levels. Further experiments on an insurance FAQ task show the effectiveness of universal representation models in real-world applications.
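To make the analogy-based evaluation described in the abstract concrete, the following is a minimal sketch of how an analogy question "a is to b as c is to ?" could be scored in a shared vector space that holds both words and phrases. It assumes the common vector-offset scoring (nearest neighbour of b - a + c by cosine similarity), which may differ from the authors' exact protocol. The toy 2-D vectors and the names `solve_analogy`, `analogy_accuracy`, `TOY` and `QUESTIONS` are hypothetical and hand-made for illustration; they are not outputs of any model from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(embeddings, a, b, c):
    """Answer 'a : b :: c : ?' with the vector-offset rule.

    The target vector is emb[b] - emb[a] + emb[c]; the answer is the
    remaining unit (word, phrase or sentence) closest to it by cosine.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [k for k in embeddings if k not in (a, b, c)]
    return max(candidates, key=lambda k: cosine(target, embeddings[k]))


def analogy_accuracy(embeddings, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(embeddings, a, b, c) == d
                  for a, b, c, d in questions)
    return correct / len(questions)


# Hypothetical toy embeddings: words and phrases live in the same space.
TOY = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "a male monarch": [3.0, 0.1],
    "a female monarch": [3.0, 0.9],
}

# One word-level question and one phrase-level question.
QUESTIONS = [
    ("man", "woman", "king", "queen"),
    ("king", "queen", "a male monarch", "a female monarch"),
]

if __name__ == "__main__":
    print(solve_analogy(TOY, "man", "woman", "king"))
    print(analogy_accuracy(TOY, QUESTIONS))
```

In a real evaluation the dictionary would be filled with vectors produced by the representation model under test, and the same scoring function would be applied unchanged to word, phrase and sentence analogies, which is what makes the evaluation task-independent.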


Pre-training Universal Language Representation

Despite the well-developed cutting-edge representation learning for language...

BURT: BERT-inspired Universal Representation from Learning Meaningful Segment

Although pre-trained contextualized language models such as BERT achieve...

Learning Better Universal Representations from Pre-trained Contextualized Language Models

Pre-trained contextualized language models such as BERT have shown great...

A Simple Geometric Method for Cross-Lingual Linguistic Transformations with Pre-trained Autoencoders

Powerful sentence encoders trained for multiple languages are on the ris...

Document Classification by Inversion of Distributed Language Representations

There have been many recent advances in the structure and measurement of...

Simplicial Complex Representation Learning

Simplicial complexes form an important class of topological spaces that ...

Introducing Orthogonal Constraint in Structural Probes

With the recent success of pre-trained models in NLP, a significant focu...