On Isotropy Calibration of Transformers

09/27/2021
by   Yue Ding, et al.

Different studies of the embedding space of transformer models suggest that the distribution of contextual representations is highly anisotropic: the embeddings are distributed in a narrow cone. Meanwhile, static word representations (e.g., Word2Vec or GloVe) have been shown to benefit from isotropic spaces. Therefore, previous work has developed methods to calibrate the embedding space of transformers in order to ensure isotropy. However, a recent study (Cai et al., 2021) shows that the embedding space of transformers is locally isotropic, which suggests that these models are already capable of exploiting the expressive capacity of their embedding space. In this work, we conduct an empirical evaluation of state-of-the-art methods for isotropy calibration on transformers and find that they do not provide consistent improvements across models and tasks. These results support the thesis that, given the local isotropy, transformers do not benefit from additional isotropy calibration.
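The "narrow cone" anisotropy described above is commonly quantified by the mean cosine similarity between random pairs of embeddings: near 0 for an isotropic (direction-uniform) space, near 1 when all vectors share a dominant direction. A minimal sketch of that proxy follows; the function name and the synthetic data are illustrative, not part of the paper's evaluation.

```python
import numpy as np

def mean_cosine_similarity(embeddings, n_pairs=10000, seed=0):
    """Estimate anisotropy as the mean cosine similarity of random embedding pairs.

    Near 0 suggests an isotropic space; near 1 suggests embeddings
    concentrated in a narrow cone.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    i = rng.integers(0, n, size=n_pairs)
    j = rng.integers(0, n, size=n_pairs)
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return float(np.mean(np.sum(normed[i] * normed[j], axis=1)))

# Isotropic baseline: i.i.d. Gaussian vectors point in uniformly random directions.
isotropic = np.random.default_rng(1).normal(size=(1000, 64))
print(mean_cosine_similarity(isotropic))   # close to 0

# Anisotropic case: a large shared offset gives all vectors a common direction.
anisotropic = isotropic + 5.0
print(mean_cosine_similarity(anisotropic))  # close to 1
```

Note that this global statistic is exactly what local-isotropy arguments complicate: per Cai et al. (2021), clusters of contextual embeddings can be isotropic locally even when the global measure indicates a narrow cone.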


