Big Data Small Data, In Domain Out-of Domain, Known Word Unknown Word: The Impact of Word Representation on Sequence Labelling Tasks

04/21/2015, by Lizhen Qu et al. (CSIRO and Carnegie Mellon University)

Word embeddings -- distributed word representations that can be learned from unlabelled data -- have been shown to have high utility in many natural language processing applications. In this paper, we perform an extrinsic evaluation of five popular word embedding methods in the context of four sequence labelling tasks: POS-tagging, syntactic chunking, NER and MWE identification. A particular focus of the paper is analysing the effects of task-based updating of word representations. We show that when using word embeddings as features, as few as several hundred training instances are sufficient to achieve competitive results, and that word embeddings lead to improvements for out-of-vocabulary (OOV) words and on out-of-domain data. Perhaps more surprisingly, our results indicate there is little difference between the different word embedding methods, and that simple Brown clusters are often competitive with word embeddings across all tasks we consider.







1 Introduction

Recently, distributed word representations have grown to become a mainstay of natural language processing (NLP), and have been shown to have empirical utility in a myriad of tasks [Collobert2008, turian2010word, baroni:2014, Andreas:Klein:2014]. The underlying idea behind distributed word representations is simple: to map each word w in our vocabulary onto a continuous-valued vector of dimensionality d. Words that are similar (e.g., with respect to syntax or lexical semantics) will ideally be mapped to similar regions of the vector space, implicitly supporting both generalisation across in-vocabulary (IV) items, and countering the effects of data sparsity for low-frequency and out-of-vocabulary (OOV) items.

Without some means of automatically deriving the vector representations without reliance on labelled data, however, word embeddings would have little practical utility. Fortunately, it has been shown that they can be “pre-trained” from unlabelled text data using various algorithms to model the distributional hypothesis (i.e., that words which occur in similar contexts tend to be semantically similar). Pre-training methods have been refined considerably in recent years, and scaled up to increasingly large corpora.

As with other machine learning methods, it is well known that the quality of the pre-trained word embeddings depends heavily on factors including parameter optimisation, the size of the training data, and the fit with the target application. For example, turian2010word showed that the optimal dimensionality for word embeddings is task-specific. One factor which has received relatively little attention in NLP is the effect of “updating” the pre-trained word embeddings as part of the task-specific training, based on self-taught learning [raina2007self]. Updating leads to word representations that are task-specific, but often at the cost of over-fitting low-frequency and OOV words.

In this paper, we perform an extensive evaluation of four recently proposed word embedding approaches under fixed experimental conditions, applied to four sequence labelling tasks: POS-tagging, full-text chunking, named entity recognition (NER), and multiword expression (MWE) identification. Compared to previous empirical studies [collobert2011natural, turian2010word, pennington2014glove], we fill gaps in the literature by considering more word embedding approaches and evaluating them over more sequence labelling tasks. In addition, we explore the following research questions:

  1. are these word embeddings better than baseline approaches of one-hot unigram features and Brown clusters?

  2. do word embeddings require less training data (i.e. generalise better) than one-hot unigram features? If so, to what degree can word embeddings reduce the amount of labelled data?

  3. what is the impact of updating word embeddings in sequence labelling tasks, both empirically over the target task and geometrically over the vectors?

  4. what is the impact of these word embeddings (with and without updating) on both OOV items (relative to the training data) and out-of-domain data?

  5. overall, are some word embeddings better than others in a sequence labelling context?

2 Word Representations

2.1 Types of Word Representations

turian2010word identifies three varieties of word representations: distributional, cluster-based, and distributed.

Distributional representation methods map each word w to a context-word vector v_w, which is constructed directly from co-occurrence counts between w and its context words. The learning methods either store the co-occurrence counts between two words w_i and w_j directly in v_{w_i} [sahlgren2006word, turney2010frequency, honkela1997self], or project the co-occurrence counts between words into a lower-dimensional space [vrehuuvrek2010software, lund1996producing], using dimensionality reduction techniques such as SVD [dumais1988using] and LDA [blei2003latent].
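As a concrete illustration of the distributional approach, the following sketch builds a small word-by-word co-occurrence matrix and projects it into a lower-dimensional space with truncated SVD. The corpus, window size, and dimensionality are toy values chosen for illustration, not taken from the paper.

```python
import numpy as np

# Toy corpus: build a word-by-word co-occurrence matrix with a +/-1 window,
# then project it into a lower-dimensional space with truncated SVD.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

counts = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in (i - 1, i + 1):
            if 0 <= j < len(sent):
                counts[idx[w], idx[sent[j]]] += 1

# Truncated SVD: keep the top-k left singular vectors, scaled by the
# singular values, as k-dimensional word vectors.
k = 2
U, S, Vt = np.linalg.svd(counts)
word_vectors = U[:, :k] * S[:k]    # each row is one word's vector
print(word_vectors.shape)          # (vocab size, k)
```

Words with similar co-occurrence rows (here, e.g., "cat" and "dog") end up with nearby rows in `word_vectors`.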

Cluster-based representation methods build clusters of words by applying either soft or hard clustering algorithms [lin2009phrase, li2005semi]. Some of them also rely on a co-occurrence matrix of words [pereira1993distributional]. The Brown clustering algorithm [Brown92class-basedn-gram] is the best-known method in this category.

Distributed representation methods usually map words into dense, low-dimensional, continuous-valued vectors v_w ∈ R^d, where d is referred to as the word dimension.

2.2 Selected Word Representations

Over a range of sequence labelling tasks, we evaluate five methods for inducing word representations: Brown clustering [Brown92class-basedn-gram] (“Brown”), the neural language model of Collobert & Weston (“CW”) [collobert2011natural], the continuous bag-of-words model (“CBOW”) [Mikolov13], the continuous skip-gram model (“Skip-gram”) [Mikolov13NIPS], and Global vectors (“Glove”) [pennington2014glove]. With the exception of CW, all have been shown to be at or near state-of-the-art in recent empirical studies [turian2010word, pennington2014glove]. CW is included because it was highly influential in earlier research, and its pre-trained embeddings are still used to some degree in NLP. The training of these word representations is unsupervised: the common underlying idea is to predict the occurrence of words in the neighbouring context. Their training objectives share the same form, namely a sum of local training factors s(w, C(w)):

L = Σ_{w ∈ V} s(w, C(w))

where V is the vocabulary of a given corpus, and C(w) denotes the local context of word w. The local context of a word can either be its previous words, or the words surrounding it. The local training factors are designed to capture the relationship between w and its local contexts of use, either by predicting w based on its local context, or by using w to predict the context words. Other than Brown, which utilises a cluster-based representation, all the other methods employ a distributed representation.

The starting point for CBOW and Skip-gram is to employ softmax to predict word occurrence:

p(w | C(w)) = exp(v_w · h) / Σ_{w′ ∈ V} exp(v_{w′} · h)

where h denotes the distributed representation of the local context of word w. CBOW derives h by averaging over the embeddings of the context words; that is, it estimates the probability of each word w given its local context. In contrast, Skip-gram applies softmax to each context word of a given occurrence of word w; in this case, h corresponds to the representation of one of its context words, and the model can be characterised as predicting context words based on w. In practice, softmax is too expensive to compute over large corpora, and thus Mikolov13NIPS use hierarchical softmax and negative sampling to scale up the training.
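The full-softmax factor can be sketched as follows; the embeddings are random and the vocabulary is tiny, purely to show why the normalisation over all of V is the expensive part.

```python
import numpy as np

# Sketch of Skip-gram's full-softmax factor: the probability of a context
# word c given centre word w is a softmax over the whole vocabulary.
rng = np.random.default_rng(0)
V, d = 100, 16                      # vocabulary size and embedding dimension
in_vecs = rng.normal(size=(V, d))   # "input" embeddings for centre words
out_vecs = rng.normal(size=(V, d))  # "output" embeddings for context words

def p_context_given_word(c, w):
    scores = out_vecs @ in_vecs[w]           # one score per vocabulary word
    scores -= scores.max()                   # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return probs[c]

# The distribution sums to one over all V words -- computing this
# normaliser for every training pair is what hierarchical softmax and
# negative sampling avoid.
total = sum(p_context_given_word(c, 3) for c in range(V))
print(round(total, 6))  # 1.0
```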

CW considers the local context of a word w to be the n words to the left and n words to the right of w. The concatenation of the embeddings of w and all its context words is taken as input to a neural network with one hidden layer, which produces a higher-level representation s(w, C(w)). The learning procedure then replaces the embedding of w with that of a randomly sampled word w′ and generates a second representation s(w′, C(w)) with the same neural network. The training objective is to maximise the difference between the two scores, via the margin loss:

max(0, 1 − s(w, C(w)) + s(w′, C(w)))

This approach can be regarded as negative sampling with only one negative example.
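A minimal sketch of this ranking criterion follows: score a genuine window with a one-hidden-layer scorer, corrupt the centre word, and apply the margin loss. The network weights are randomly initialised and the dimensions are illustrative.

```python
import numpy as np

# Sketch of the CW ranking criterion: score a real window, replace its
# centre embedding with a random one, and take a hinge loss between scores.
rng = np.random.default_rng(1)
d, ctx, hidden = 8, 2, 16           # embedding dim, context radius, hidden units
W1 = rng.normal(size=(hidden, (2 * ctx + 1) * d))
W2 = rng.normal(size=hidden)

def score(window_embeddings):
    x = np.concatenate(window_embeddings)    # concatenated window as input
    return W2 @ np.tanh(W1 @ x)              # one hidden layer -> scalar score

window = [rng.normal(size=d) for _ in range(2 * ctx + 1)]
corrupted = list(window)
corrupted[ctx] = rng.normal(size=d)          # negative sample: random centre word

# Hinge loss: push the true score above the corrupted score by a margin of 1.
loss = max(0.0, 1.0 - score(window) + score(corrupted))
print(loss >= 0.0)  # True
```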

Glove assumes that the dot product of two word embeddings should be similar to the logarithm of the co-occurrence count of the two words. As such, the local factor becomes:

f(C_{ij}) (v_i · v_j + b_i + b_j − log C_{ij})²

where b_i and b_j are the bias terms of words w_i and w_j, respectively, C_{ij} is their co-occurrence count, and f(C_{ij}) is a weighting function based on the co-occurrence count. This weighting function controls the degree of agreement between the parametric function v_i · v_j + b_i + b_j and log C_{ij}: frequently co-occurring word pairs receive larger weight than infrequent pairs, up to a threshold.
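The local factor can be sketched directly; the weighting function below uses the commonly cited form f(x) = (x / x_max)^α capped at 1, with x_max and α as illustrative defaults.

```python
import numpy as np

# Sketch of GloVe's local factor for one word pair (i, j):
#   f(C_ij) * (v_i . v_j + b_i + b_j - log C_ij)^2
# f caps the weight at 1, so very frequent pairs stop gaining influence.
def glove_factor(v_i, v_j, b_i, b_j, count, x_max=100.0, alpha=0.75):
    weight = min(1.0, (count / x_max) ** alpha)   # larger weight for frequent pairs
    residual = v_i @ v_j + b_i + b_j - np.log(count)
    return weight * residual ** 2

rng = np.random.default_rng(2)
v_i, v_j = rng.normal(size=8), rng.normal(size=8)
factor = glove_factor(v_i, v_j, 0.1, -0.2, count=2.0)
print(factor >= 0.0)  # True: each local factor is a weighted squared error
```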

Brown partitions words into a finite set of word classes. The conditional probability of seeing the next word w_i is defined to be:

p(w_i | w_{i−1}, …, w_1) = p(c_i | c_{i−1}, …, c_1) p(w_i | c_i)

where c_i denotes the word class of the word w_i, w_{i−1}, …, w_1 are the previous words, and c_{i−1}, …, c_1 are their respective word classes. Since there is no tractable method to find an optimal partition of word classes, the method uses only a bigram class model, i.e. p(w_i | w_{i−1}) = p(c_i | c_{i−1}) p(w_i | c_i), and utilises hierarchical clustering as an approximation method to find a sufficiently good partition of words.
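The class-bigram decomposition can be sketched with a hand-made class assignment (real Brown clustering learns the partition greedily); the words, classes, and token sequence below are illustrative.

```python
from collections import Counter

# Sketch of the class-bigram decomposition
#   p(w_i | w_{i-1}) = p(c_i | c_{i-1}) * p(w_i | c_i)
# over a toy corpus with a fixed, hand-made word-to-class mapping.
word_class = {"monday": "DAY", "friday": "DAY", "runs": "VERB", "sleeps": "VERB"}
tokens = ["monday", "runs", "friday", "sleeps", "monday", "sleeps"]

class_seq = [word_class[w] for w in tokens]
class_bigrams = Counter(zip(class_seq, class_seq[1:]))
class_counts = Counter(class_seq)
word_counts = Counter(tokens)

def p_next(word, prev_word):
    c, prev_c = word_class[word], word_class[prev_word]
    p_class = class_bigrams[(prev_c, c)] / class_counts[prev_c]  # p(c_i | c_{i-1})
    p_word = word_counts[word] / class_counts[c]                 # p(w_i | c_i)
    return p_class * p_word

print(round(p_next("runs", "monday"), 3))  # 0.333
```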

2.3 Building Word Representations

For a fair comparison, we train Brown, CBOW, Skip-gram, and Glove on a fixed corpus, comprised of freely available corpora, as detailed in Tab. 1. The joint corpus was preprocessed with the Stanford CoreNLP sentence splitter and tokeniser. All consecutive digit substrings were replaced by NUMf, where f is the length of the digit substring (e.g., 10.20 is replaced by NUM2.NUM2). Due to the computational complexity of the pre-training, for CW, we simply downloaded the pre-compiled embeddings.

Data set Size Words
UMBC 48.1GB 3G
One Billion 4.1GB 1G
English Wikipedia 49.6GB 3G
Table 1: Corpora used to pre-train the word embeddings
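The digit normalisation described above is a one-line substitution; the following sketch implements the behaviour as stated (each maximal digit run becomes NUMf, where f is its length).

```python
import re

# Replace every maximal digit run with "NUM" + its length, as described in
# the text: each consecutive digit substring becomes NUMf.
def normalise_digits(text):
    return re.sub(r"\d+", lambda m: f"NUM{len(m.group())}", text)

print(normalise_digits("10.20"))        # NUM2.NUM2
print(normalise_digits("in 1993, 42"))  # in NUM4, NUM2
```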

The dimensionality of the word embeddings and the size of the context window are the key hyperparameters when learning distributed representations. We use all combinations of the following values to train word embeddings on the combined corpus:

  • Embedding dim. 

  • Context window size

Brown requires only the number of clusters as a hyperparameter; we perform clustering over a range of cluster sizes.

3 Sequence Labelling Tasks

We evaluate the different word representations over four sequence labelling tasks: POS-tagging, full-text chunking (Chunking), named entity recognition (NER), and MWE identification (MWE). For each task, we feed features into a first-order linear-chain graph transformer [collobert2011natural] made up of two layers: the upper layer is identical to a linear-chain CRF [lafferty2001conditional], and the lower layer consists of word representations and hand-crafted features. If we treat the word representations as fixed, the graph transformer is a simple linear-chain CRF; if instead we treat the word representations as model parameters, the model is equivalent to a neural network with word embeddings as the input layer. We trained all models using AdaGrad [duchi2011adaptive].

As in turian2010word, at each word position we construct word representation features from the words in a context window of size two to either side of the target word, based on the pre-trained representation of each word type. For Brown, the features are prefix features extracted from the word clusters, in the same way as turian2010word. As a baseline (and to test RQ1), we include a one-hot representation (which is equivalent to a linear-chain CRF with only lexical context features).
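Because Brown clusters are learned by hierarchical merging, each word's cluster is a bit-string path in the merge tree, and prefixes of that path give clusterings at coarser granularities. The cluster paths and prefix lengths in this sketch are illustrative, not the settings used in the experiments.

```python
# Sketch of Brown prefix features: a word's cluster is a bit-string path in
# the hierarchical merge tree; path prefixes act as coarser cluster IDs.
# The paths and prefix lengths below are illustrative.
brown_paths = {"monday": "0110", "friday": "0111", "bank": "1010"}

def brown_prefix_features(word, prefix_lengths=(1, 2, 4)):
    path = brown_paths.get(word)
    if path is None:
        return []                              # OOV for the clustering
    return [f"brown_p{n}={path[:n]}" for n in prefix_lengths if len(path) >= n]

print(brown_prefix_features("monday"))  # ['brown_p1=0', 'brown_p2=01', 'brown_p4=0110']
```

Note that "monday" and "friday" share the coarse prefixes `0` and `01`, so the coarse features generalise across the two days even though their full clusters differ.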

Our hand-crafted features for POS-tagging, Chunking and MWE, are those used by collobert2011natural, turian2010word and mwecorpus, respectively. For NER, we use the same feature space as turian2010word, except for the previous two predictions, because we want to evaluate all word representations with the same type of model – a first-order graph transformer.

In training the distributed word representations, we consider two settings: (1) the word representations are fixed during sequence model training; and (2) the graph transformer updates the token-level word representations during training.

Training Development In-domain Test Out-of-domain Test
POS-tagging WSJ Sec. 0-18 WSJ Sec. 19–21 WSJ Sec. 22–24 EWT
Chunking WSJ WSJ (1K sentences) WSJ (CoNLL-00 test) Brown
NER Reuters (CoNLL-03 train) Reuters (CoNLL-03 dev) Reuters (CoNLL-03 test) MUC7
MWE EWT (500 docs) EWT (100 docs) EWT (123 docs)
Table 2: Training, development and test (in- and out-of-domain) data for each sequence labelling task.

As outlined in Tab. 2, for each sequence labelling task, we experiment over the de facto corpus, based on pre-existing training–dev–test splits where available (for the MWE dataset, no such split pre-existed, so we constructed our own):

  1. the Wall Street Journal portion of the Penn Treebank (Marcus:1993: “WSJ”) with Penn POS tags

  2. the Wall Street Journal portion of the Penn Treebank (“WSJ”), converted into IOB-style full-text chunks using the CoNLL conversion scripts for training and dev, and the WSJ-derived CoNLL-2000 full text chunking test data for testing [TjongKimSang:Buchholz:2000]

  3. the English portion of the CoNLL-2003 Named Entity Recognition data set, for which the source data was taken from Reuters newswire articles (TjongKimSang:DeMeulder:2003: “Reuters”)

  4. the MWE dataset of mwecorpus, over a portion of text from the English Web Treebank (“EWT”)

For all tasks other than MWE (unfortunately, there is no second domain which has been hand-tagged with MWEs using the method of mwecorpus to use as an out-of-domain test corpus), we additionally have an out-of-domain test set, in order to evaluate the out-of-domain robustness of the different word representations, with and without updating. These datasets are as follows:

  1. the English Web Treebank with Penn POS tags (“EWT”)

  2. the Brown Corpus portion of the Penn Treebank (“Brown”), converted into IOB-style full-text chunks using the CoNLL conversion scripts

  3. the MUC-7 named entity recognition corpus (“MUC7”)

For reproducibility, we tuned the hyperparameters with random search over the development data for each task [bergstra2012random]. In this, we randomly sampled 50 distinct hyperparameter sets with the same random seed for the non-updating models (i.e. the models that don’t update the word representation), and sampled 100 distinct hyperparameter sets for the updating models (i.e. the models that do). For each set of hyperparameters and task, we train a model over its training set and choose the best one based on its performance on development data [turian2010word]. We also tune the word representation hyperparameters – namely, the word vector size and context window size (distributed representations), and in the case of Brown, the number of clusters.

For the updating models, we found that the results over the test data were always inferior to those that do not update the word representations, due to the higher number of hyperparameters and small sample size (i.e. 100). Since the two-layer model of the graph transformer contains a distinct set of hyperparameters for each layer, we reuse the best-performing hyperparameter settings from the non-updating models, and only tune the hyperparameters of AdaGrad for the word representation layer. This method requires only 32 additional runs and achieves consistently better results than 100 random draws.

In order to test the impact of the volume of training data on the different models (RQ2), we split the training set into 10 partitions based on a base-2 log scale (i.e., the second smallest partition will be twice the size of the smallest partition), and created 10 successively larger training sets by merging these partitions from the smallest one to the largest one, and used each of these to train a model. From these, we construct learning curves over each task.
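The base-2 log-scale splitting can be sketched as follows; the rounding scheme and the example total of 10,230 instances are our assumptions, chosen so the sizes come out exact.

```python
# Sketch of the base-2 log-scale training splits: 10 partitions, each twice
# the size of the previous, merged cumulatively into 10 nested training sets.
def log_scale_splits(n_instances, n_parts=10):
    unit = n_instances / (2 ** n_parts - 1)        # smallest partition size
    sizes = [round(unit * 2 ** i) for i in range(n_parts)]
    sizes[-1] += n_instances - sum(sizes)          # absorb rounding error
    cumulative, total = [], 0
    for s in sizes:
        total += s
        cumulative.append(total)
    return cumulative                              # nested training-set sizes

sets = log_scale_splits(10230)
print(sets[0], sets[1], sets[-1])  # 10 30 10230
```

Each nested set doubles (plus the base) over the last, giving evenly spaced points on a log-scale learning curve.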

For ease of comparison with previous results, we evaluate both in- and out-of-domain using chunk/entity/expression-level F1-measure (“F1”) for all tasks except POS-tagging, for which we use token-level accuracy (“Acc”). To test performance over OOV (unknown) tokens – i.e., the words that do not occur in the training set – we use token-level accuracy for all tasks (e.g. for Chunking, we evaluate whether the full IOB tag is correct or not), due to the sparsity of all-OOV chunks/NEs/MWEs.
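The OOV evaluation described above reduces to scoring only the tokens absent from the training vocabulary; this sketch uses hypothetical IOB tags and made-up tokens.

```python
# Sketch of token-level OOV accuracy: score only tokens absent from the
# training vocabulary, comparing full predicted tags (e.g. IOB tags) to gold.
def oov_accuracy(tokens, gold, predicted, train_vocab):
    pairs = [(g, p) for t, g, p in zip(tokens, gold, predicted)
             if t not in train_vocab]
    if not pairs:
        return None                      # no OOV tokens to score
    return sum(g == p for g, p in pairs) / len(pairs)

train_vocab = {"the", "bank", "of"}
tokens = ["the", "zorp", "bank", "quux"]
gold = ["B-NP", "I-NP", "I-NP", "B-VP"]
pred = ["B-NP", "I-NP", "B-NP", "B-VP"]
print(oov_accuracy(tokens, gold, pred, train_vocab))  # 1.0 (both OOV tokens correct)
```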

4 Experimental Results and Discussion

Task Benchmark In-domain Test set Out-of-domain Test set
POS-tagging (Acc) 0.972 [Toutanova:2003] 0.959 (Skip-gram+UP) 0.910 (Skip-gram)
Chunking (F1) 0.942 [Sha:2003] 0.938 (Brown) 0.676 (Glove)
NER (F1) 0.893 [Ando:2005] 0.868 (Skip-gram) 0.736 (Skip-gram)
MWE (F1) 0.625 [Schneider+:2014] 0.654 (CBOW+UP)
Table 3: State-of-the-art results vs. our best results for in-domain and out-of-domain test sets.
Figure 1: Results for each type of word representation over POS-tagging (Acc), Chunking (F1), NER (F1) and MWE (F1), optionally with updating (“+UP”). The x-axis indicates the training data sizes (on a log scale). Green = high performance, and red = low performance, based on a linear scale of the best to worst result for each task.
Figure 2: Acc over out-of-vocabulary (OOV) words for in-domain and out-of-domain test sets.

We structure our evaluation by stepping through each of our five research questions (RQ1–5) from the start of the paper. In this, we make reference to: (1) the best-performing method both in- and out-of-domain vs. the state-of-the-art (Tab. 3); (2) a heat map for each task indicating the convergence rate for each word representation, with and without updating (Fig. 1); (3) OOV accuracy both in-domain and out-of-domain for each task (Fig. 2); and (4) visualisation of the impact of updating on word embeddings, based on t-SNE (Fig. 3).

RQ1: Are the selected word embeddings better than one-hot unigram features and Brown clusters?

As shown in Tab. 3, the best-performing method for every task except in-domain Chunking is a word embedding method, although the precise method varies greatly. Fig. 1, on the other hand, tells a more subtle story: the difference between Unigram and the other word representations is relatively modest, especially as the amount of training data increases. Additionally, the difference between Brown and the word embedding methods is modest across all tasks. So, the overall answer would appear to be: yes for unigrams when there is little training data, but not really relative to Brown.

RQ2: Do word embedding features require less training data?

Fig. 1 shows that for POS-tagging and NER, with only several hundred training instances, word embedding features achieve superior results to Unigram. For example, when trained with 561 instances, the POS-tagging model using Skip-gram+UP embeddings is 5.3% above Unigram; and when trained with 932 instances, the NER model using Skip-gram is 11.7% above Unigram. Similar improvements are also found for other types of word embeddings and Brown, when the training set is small. However, all word representations perform similarly for Chunking regardless of training data size. For MWE, Brown performs slightly better than the other methods when trained with approximately 25% of the training instances. Therefore, we conjecture that the POS-tagging and NER tasks benefit more from distributional similarity than Chunking and MWE.

RQ3: Does task-specific updating improve all word embeddings across all tasks?

Based on Fig. 1, updating of word representations can both correct poorly-learned word representations and harm pre-trained representations, the latter due to overfitting. For example, Glove performs significantly worse than Skip-gram on both POS-tagging and NER without updating, but with updating, the gap between its results and those of the best-performing method narrows. In contrast, Skip-gram performs worse over the test data with updating, despite its results on the development set improving by 1%.

To further investigate the effects of updating, we sampled 60 words and plotted the changes in their word embeddings under updating, using 2-d vector fields generated with matplotlib and t-SNE [vanderMaaten:Hinton:2008]. Half of the words were chosen manually to include known word clusters such as days of the week and names of countries; the other half were selected randomly. Additional plots with 100 randomly-sampled words and the top-100 most frequent words, for all the methods and all the tasks, can be found in the supplementary material. In each plot, a single arrow signifies one word, pointing from the position of the original word embedding to the updated representation.
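The vector-field construction can be sketched as follows: project the original and updated embeddings jointly into 2-d (t-SNE in the paper; random 2-d points stand in for the projection here), then each arrow runs from a word's old position to its new one. With matplotlib, `plt.quiver(x, y, u, v)` would render the arrows; the data below are synthetic.

```python
import numpy as np

# Sketch of the vector-field data: arrow tails at the projected original
# embeddings, arrow directions given by the displacement to the updated ones.
rng = np.random.default_rng(3)
n_words = 60
before_2d = rng.normal(size=(n_words, 2))                        # projected original embeddings
after_2d = before_2d + rng.normal(scale=0.3, size=(n_words, 2))  # projected updated embeddings

x, y = before_2d[:, 0], before_2d[:, 1]      # arrow tails
u, v = (after_2d - before_2d).T              # arrow directions
magnitudes = np.hypot(u, v)                  # per-word change in the 2-d view
print(magnitudes.shape)  # (60,)
```

Comparing the directions of arrows within a known cluster (e.g. days of the week) is what distinguishes the homogeneous NER updates from the scattered Chunking updates discussed below.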

In Fig. 3, we show vector fields plots for Chunking and NER using Skip-gram embeddings. For Chunking, most of the vectors were changed with similar magnitude, but in very different directions, including within the clusters of days of the week and country names. In contrast, for NER, there was more homogeneous change in word vectors belonging to the same cluster. This greater consistency is further evidence that semantic homogeneity appears to be more beneficial for NER than Chunking.

Figure 3: A t-SNE plot of the impact of updating on Skip-gram

RQ4: What is the impact of word embeddings cross-domain and for OOV words?

As shown in Tab. 3, results predictably drop when we evaluate out of domain. The difference is most pronounced for Chunking, where there is an absolute drop in F1 of around 30% for all methods, indicating that word embeddings and unigram features provide similar information for Chunking.

Another interesting observation is that updating often hurts out-of-domain performance, because the data distribution differs between domains. This suggests that, if the objective is to optimise performance across domains, it is best not to perform updating.

We also analyse performance on OOV words, both in-domain and out-of-domain, in Fig. 2. As expected, word embeddings and Brown excel in out-of-domain OOV performance. Consistent with our overall observations about cross-domain generalisation, the OOV results are better when updating is not performed.

RQ5: Overall, are some word embeddings better than others?

Comparing the different word embedding techniques over our four sequence labelling tasks, for the different evaluations (overall, out-of-domain and OOV), there is no clear winner among the word embeddings – for POS-tagging, Skip-gram appears to have a slight advantage, but this does not generalise to other tasks.

While the aim of this paper was not to achieve the state of the art over the respective tasks, it is important to concede that our best (in-domain) results for NER, POS-tagging and Chunking are slightly worse than the state of the art (Tab. 3). The 2.7% difference between our NER system and the best-performing system is due to the fact that we use a first-order rather than a second-order CRF [Ando:2005], and for the other tasks, there are similar differences in the learner and the complexity of the features used. Another difference is that we tuned the hyperparameters with random search, to enable replication using the same random seed. In contrast, the hyperparameters for the state-of-the-art methods are tuned more extensively by experts, making them more difficult to reproduce.

5 Related Work

collobert2011natural proposed a unified neural network framework that learns word embeddings, and applied it to POS-tagging, Chunking, NER and semantic role labelling. When they combined word embeddings with hand-crafted features (e.g., word suffixes for POS-tagging; gazetteers for NER) and applied other tricks like cascading and classifier combination, they achieved state-of-the-art performance. Similarly, turian2010word evaluated three different word representations on NER and Chunking, and concluded that unsupervised word representations improved both tasks. They also found that combining different word representations can further improve performance. guo2014revisiting also explored different ways of using word embeddings for NER. owoputi2013improved and Schneider+:2014 found that Brown clustering enhances Twitter POS-tagging and MWE identification, respectively. Compared to previous work, we consider more word representations, including the most recent work, and evaluate them on more sequence labelling tasks, wherein the models are trained with training sets of varying size.

Bansal+:2014 reported that direct use of word embeddings in dependency parsing did not show improvement. They achieved an improvement only when they performed hierarchical clustering of the word embeddings, and used features extracted from the cluster hierarchy. In a similar vein, Andreas:Klein:2014 explored the use of word embeddings for constituency parsing, and concluded that the information contained in word embeddings might duplicate that acquired by a syntactic parser, unless the training set is extremely small. Other syntactic parsing studies that report improvements from using word embeddings include Koo:2008, Koo:2010, Haffari:2011, Tratz:2011 and chen:2014.

Word embeddings have also been applied to other (non-sequential NLP) tasks like grammar induction [Spitkovsky:2011], and semantic tasks such as semantic relatedness, synonymy detection, concept categorisation, selectional preference learning and analogy [baroni:2014].

Huang:2009 demonstrated that using distributional word representation methods (like TF-IDF and LSA) as features improves the labelling of OOV words, when tested on POS-tagging and Chunking. In our study, we evaluate the labelling performance on OOV words for updated vs. non-updated word embeddings, relative to the training set and on out-of-domain data.

6 Conclusions

We have performed an extensive extrinsic evaluation of four word embedding methods under fixed experimental conditions, and evaluated their applicability to four sequence labelling tasks: POS-tagging, Chunking, NER and MWE identification. We found that word embedding features reliably outperformed unigram features, especially with limited training data, but that there was relatively little difference over Brown clusters, and no one embedding method was consistently superior across the different tasks and settings. Word embeddings and Brown clusters were also found to improve performance out-of-domain and for OOV words. We expected a performance gap between the fixed and task-updated embeddings, but the observed difference was marginal; indeed, we found that updating can result in overfitting. We also carried out a preliminary analysis of the impact of updating on the vectors, a direction which we intend to pursue further.