An Exploration of Word Embedding Initialization in Deep-Learning Tasks

11/24/2017
by Tom Kocmi, et al.

Word embeddings are the interface between the world of discrete units of text processing and the continuous, differentiable world of neural networks. In this work, we examine various random and pretrained initialization methods for embeddings used in deep networks and their effect on performance across four NLP tasks with both recurrent and convolutional architectures. We confirm that pretrained embeddings perform slightly better than random initialization, especially with respect to the speed of learning. On the other hand, we observe no significant difference between the various methods of random initialization, as long as the variance is kept reasonably low. High-variance initialization prevents the network from using the embedding space effectively and forces it to rely on other free parameters to accomplish the task. We support this hypothesis by observing performance in learning lexical relations and by the fact that the network can learn to perform its task reasonably well even with fixed random embeddings.
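A minimal sketch of the three initialization regimes compared above, written in PyTorch. This is an illustration under assumed dimensions and variance values, not the authors' code; the pretrained matrix below is a random placeholder standing in for word2vec or GloVe vectors.

import torch
import torch.nn as nn

vocab_size, emb_dim = 10000, 300  # illustrative sizes, not from the paper

# Low-variance random initialization: small normal noise keeps all word
# vectors in a compact region that training can easily reorganize.
low_var = nn.Embedding(vocab_size, emb_dim)
nn.init.normal_(low_var.weight, mean=0.0, std=0.1)

# High-variance random initialization: the regime the abstract warns about;
# vectors start so far apart that the network tends to leave them in place
# and compensates with its other free parameters.
high_var = nn.Embedding(vocab_size, emb_dim)
nn.init.normal_(high_var.weight, mean=0.0, std=10.0)

# Pretrained initialization: copy externally trained vectors into the table.
# `pretrained_matrix` is a placeholder here; in practice it would hold
# word2vec or GloVe vectors aligned with the task vocabulary.
pretrained_matrix = torch.randn(vocab_size, emb_dim)
pretrained = nn.Embedding.from_pretrained(pretrained_matrix, freeze=False)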

Related research

10/05/2017: On the Effective Use of Pretraining for Natural Language Inference
Neural networks have excelled at many NLP tasks, but there remain open q...

08/09/2022: Where's the Learning in Representation Learning for Compositional Semantics and the Case of Thematic Fit
Observing that for certain NLP tasks, such as semantic role prediction o...

04/01/2021: Evaluating Neural Word Embeddings for Sanskrit
Recently, the supervised learning paradigm's surprisingly remarkable per...

07/10/2020: GloVeInit at SemEval-2020 Task 1: Using GloVe Vector Initialization for Unsupervised Lexical Semantic Change Detection
This paper presents a vector initialization approach for the SemEval2020...

07/22/2020: Rethinking CNN Models for Audio Classification
In this paper, we show that ImageNet-Pretrained standard deep CNN models...

01/15/2013: The Expressive Power of Word Embeddings
We seek to better understand the difference in quality of the several pu...

06/06/2019: Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP
The lottery ticket hypothesis proposes that over-parameterization of dee...