Updating Pre-trained Word Vectors and Text Classifiers using Monolingual Alignment

10/14/2019
by Piotr Bojanowski, et al.

In this paper, we focus on the problem of adapting word vector-based models to new textual data. Given a model pre-trained on large reference data, how can we adapt it to a smaller dataset with a slightly different language distribution? We frame the adaptation problem as a monolingual word vector alignment problem and simply average the models after alignment. We align vectors using the RCSLS criterion. Our formulation results in a simple and efficient algorithm for adapting general-purpose models to changing word distributions. In our evaluation, we consider applications to word embedding and text classification models. We show that the proposed approach yields good performance in all setups and outperforms a baseline that fine-tunes the model on the new data.
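The align-then-average idea can be illustrated with a minimal sketch. This is not the authors' implementation. It swaps the RCSLS criterion for a closed-form orthogonal Procrustes rotation, uses a uniform average, and assumes both models are plain word-to-vector dictionaries of equal dimension. The names `procrustes_rotation` and `align_and_average` are hypothetical.

```python
import numpy as np

def procrustes_rotation(src, tgt):
    """Return the orthogonal matrix Q minimizing ||src @ Q - tgt||_F.

    Stand-in for RCSLS: this is the classical Procrustes solution,
    not the criterion used in the paper.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def align_and_average(ref, new):
    """Rotate `new` into the space of `ref`, then average shared words.

    ref, new: dict mapping word -> 1-D numpy array (same dimension).
    Returns the merged dict and the rotation that was applied to `new`.
    """
    # The rotation is estimated only on words present in both models.
    shared = sorted(set(ref) & set(new))
    src = np.stack([new[w] for w in shared])
    tgt = np.stack([ref[w] for w in shared])
    q = procrustes_rotation(src, tgt)

    merged = {}
    for w in set(ref) | set(new):
        if w in ref and w in new:
            # Uniform average (an assumption of this sketch).
            merged[w] = 0.5 * (ref[w] + new[w] @ q)
        elif w in ref:
            merged[w] = ref[w]
        else:
            # Words seen only in the new data are kept, after rotation.
            merged[w] = new[w] @ q
    return merged, q
```

With this layout, words that occur only in the new data still end up in the reference space, so the merged dictionary can be used as a drop-in replacement for the reference model's vectors.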


