
Word-level Lexical Normalisation using Context-Dependent Embeddings

by Michael Stewart, et al.
The University of Western Australia

Lexical normalisation (LN) is the process of correcting each word in a dataset to its canonical form so that it may be analysed more easily and more accurately. Most lexical normalisation systems operate at the character level, while word-level models are seldom used. Recent language models offer solutions to the drawbacks of word-level LN models, yet, to the best of our knowledge, no research has investigated their effectiveness on LN. In this paper we introduce a word-level GRU-based LN model and investigate the effectiveness of recent embedding techniques on word-level LN. Our results show that our GRU-based word-level model achieves better results than character-level models, and outperforms existing deep-learning-based LN techniques on Twitter data. We also find that randomly-initialised embeddings are capable of outperforming pre-trained embedding models in certain scenarios. Finally, we release a substantial lexical normalisation dataset to the community.
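To make the word-level framing concrete, the sketch below implements a single GRU recurrence over a sequence of word embeddings, where each per-token hidden state would feed a classifier over canonical word forms. This is a minimal illustration under assumed toy dimensions and randomly-initialised weights; it is not the paper's architecture or configuration.

```python
import numpy as np

# Toy dimensions (hypothetical, not the paper's settings).
EMB_DIM, HID_DIM = 8, 16

# Randomly-initialised parameters; the paper also compares
# randomly-initialised embeddings against pre-trained ones.
rng = np.random.default_rng(0)
Wz, Uz, bz = rng.standard_normal((HID_DIM, EMB_DIM)), rng.standard_normal((HID_DIM, HID_DIM)), np.zeros(HID_DIM)
Wr, Ur, br = rng.standard_normal((HID_DIM, EMB_DIM)), rng.standard_normal((HID_DIM, HID_DIM)), np.zeros(HID_DIM)
Wh, Uh, bh = rng.standard_normal((HID_DIM, EMB_DIM)), rng.standard_normal((HID_DIM, HID_DIM)), np.zeros(HID_DIM)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h):
    """One GRU update: gates control how much prior context is kept per token."""
    z = sigmoid(Wz @ x + Uz @ h + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h + br)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)  # candidate state
    return (1.0 - z) * h + z * h_tilde

# Encode a short "tweet" of word embeddings left to right.
sentence = rng.standard_normal((4, EMB_DIM))       # 4 toy word vectors
h = np.zeros(HID_DIM)
states = []
for word_vec in sentence:
    h = gru_step(word_vec, h)
    states.append(h)

# Each per-token state would feed a softmax over canonical forms,
# turning normalisation into word-level sequence labelling.
print(len(states), states[0].shape)
```

Because the model predicts one canonical form per input word, context from the GRU state can disambiguate tokens like "u" (→ "you") that a purely character-level view would handle less directly.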



