Effect of Post-processing on Contextualized Word Representations

by Hassan Sajjad, et al.

Post-processing of static embeddings has been shown to improve their performance on both lexical and sequence-level tasks. However, post-processing for contextualized embeddings is an under-studied problem. In this work, we question the usefulness of post-processing for contextualized embeddings obtained from different layers of pre-trained language models. More specifically, we standardize individual neuron activations using z-score, min-max normalization, and by removing top principal components using the all-but-the-top method. Additionally, we apply unit length normalization to word representations. On a diverse set of pre-trained models, we show that post-processing unwraps vital information present in the representations for both lexical tasks (such as word similarity and analogy) and sequence classification tasks. Our findings raise interesting points in relation to the research studies that use contextualized representations, and suggest z-score normalization as an essential step to consider when using them in an application.
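The four post-processing operations the abstract names can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function names and the small epsilon added for numerical stability are our own choices, and each function operates on a matrix whose rows are word representations and whose columns are individual neuron dimensions.

```python
import numpy as np

def z_score(X):
    # Standardize each dimension (neuron) to zero mean, unit variance.
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def min_max(X):
    # Rescale each dimension to the [0, 1] range.
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn + 1e-8)

def all_but_the_top(X, d=2):
    # All-but-the-top: subtract the mean, then project out the
    # top-d principal components from every vector.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    top = Vt[:d]                        # shape (d, dim)
    return Xc - (Xc @ top.T) @ top

def unit_length(X):
    # L2-normalize every word representation to unit length.
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
```

In practice these would be applied to the activations extracted from a chosen layer of a pre-trained model before feeding them to a lexical or sequence-level task.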






