How to Evaluate Word Representations of Informal Domain?

11/12/2019
by Yekun Chai, et al.

Diverse word representations have proliferated across most state-of-the-art natural language processing (NLP) applications. Nevertheless, how to efficiently evaluate such word embeddings in informal domains, such as Twitter or forums, remains an open challenge because evaluation datasets are scarce. We derive a large list of variant spelling pairs from UrbanDictionary using two automatic approaches: weakly-supervised pattern-based bootstrapping and a self-training linear-chain conditional random field (CRF). With these extracted relation pairs, we make the case for skipping the text normalization step of traditional NLP pipelines and directly using representations of non-standard words in the informal domain. Our code is available.


