Unsupervised Learning of Style-sensitive Word Vectors

05/15/2018
by   Reina Akama, et al.

This paper presents the first study aimed at capturing stylistic similarity between words in an unsupervised manner. We propose extending the continuous bag-of-words (CBOW) model (Mikolov et al., 2013) to learn style-sensitive word vectors using a wider context window, under the assumption that the style of all the words in an utterance is consistent. In addition, we introduce a novel task of predicting lexical stylistic similarity and create a benchmark dataset for it. Our experiment with this dataset supports our assumption and demonstrates that the proposed extensions contribute to the acquisition of style-sensitive word embeddings.
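The wider-context-window idea can be approximated with off-the-shelf word2vec tooling. The sketch below is not the authors' implementation (their full model learns separate stylistic and syntactic/semantic sub-vectors); it simply trains a CBOW model with gensim while setting the window large enough to cover an entire utterance, so every word in the utterance serves as context for every other word, in line with the consistency assumption above. The toy corpus, window size, and hyperparameters are illustrative assumptions.

```python
# Rough sketch (not the authors' code): approximate the "wider context window"
# idea by training CBOW with a window large enough to span whole utterances,
# so that every word in an utterance acts as context for every other word.
from gensim.models import Word2Vec

# Toy utterances; in practice this would be a large dialogue/utterance corpus.
utterances = [
    ["hey", "dude", "wanna", "grab", "a", "beer"],
    ["good", "evening", "sir", "would", "you", "care", "for", "a", "drink"],
    ["yo", "that", "movie", "was", "totally", "awesome"],
    ["the", "film", "was", "most", "impressive", "indeed"],
]

model = Word2Vec(
    sentences=utterances,
    vector_size=100,   # embedding dimensionality (assumed value)
    window=1000,       # far larger than any utterance -> whole-utterance context
    min_count=1,
    sg=0,              # CBOW objective, as in Mikolov et al. (2013)
    epochs=200,
)

# Nearest neighbors under this objective mix topical and stylistic similarity;
# the paper's full model separates these into distinct sub-vectors.
print(model.wv.most_similar("dude", topn=3))
```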
