PWESuite: Phonetic Word Embeddings and Tasks They Facilitate

04/05/2023
by   Vilém Zouhar, et al.

Word embeddings that map words into a fixed-dimensional vector space are the backbone of modern NLP. Most word embedding methods encode semantic information. However, phonetic information, which is important for some tasks, is often overlooked. In this work, we develop several novel methods that leverage articulatory features to build phonetically informed word embeddings, and we present a set of phonetic word embeddings to encourage their community development, evaluation, and use. While several methods for learning phonetic word embeddings already exist, there is a lack of consistency in evaluating their effectiveness. Thus, we also propose several ways to evaluate both intrinsic aspects of phonetic word embeddings, such as word retrieval and correlation with sound similarity, and extrinsic performance, such as rhyme and cognate detection and sound analogies. We hope that our suite of tasks will promote reproducibility and provide direction for future research on phonetic word embeddings.
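To make the "correlation with sound similarity" evaluation concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that sound dissimilarity can be approximated by edit distance over phoneme sequences (a simple stand-in; the abstract mentions articulatory features but does not specify a distance), and it uses hypothetical toy embeddings and pronunciations. All function names here are illustrative.

```python
import math
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
    return 1.0 - dot / norm


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def sound_correlation(embeddings, pronunciations):
    """Correlate embedding distance with phoneme edit distance over all word pairs."""
    emb_d, snd_d = [], []
    for w1, w2 in combinations(sorted(embeddings), 2):
        emb_d.append(cosine_distance(embeddings[w1], embeddings[w2]))
        snd_d.append(edit_distance(pronunciations[w1], pronunciations[w2]))
    return pearson(emb_d, snd_d)


# Hypothetical toy data: 2-d embeddings and rough phoneme transcriptions.
toy_embeddings = {"cat": [1.0, 0.1], "bat": [0.9, 0.2], "dog": [0.1, 1.0]}
toy_pronunciations = {
    "cat": ("k", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "dog": ("d", "ɔ", "g"),
}

score = sound_correlation(toy_embeddings, toy_pronunciations)
print(f"correlation between embedding and sound distance: {score:.3f}")
```

A higher correlation would suggest that words that sound alike also lie close together in the embedding space. With real data, a rank correlation and a phonetically motivated distance may be more appropriate than the choices assumed here.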

research
11/12/2020

Deconstructing word embedding algorithms

Word embeddings are reliable feature representations of words used to ob...
research
04/09/2019

Characterizing the impact of geometric properties of word embeddings on task performance

Analysis of word embedding properties to inform their use in downstream ...
research
09/05/2018

Firearms and Tigers are Dangerous, Kitchen Knives and Zebras are Not: Testing whether Word Embeddings Can Tell

This paper presents an approach for investigating the nature of semantic...
research
06/16/2022

TransDrift: Modeling Word-Embedding Drift using Transformer

In modern NLP applications, word embeddings are a crucial backbone that ...
research
12/01/2020

Intrinsic analysis for dual word embedding space models

Recent word embeddings techniques represent words in a continuous vector...
research
08/14/2018

Embedding Grammars

Classic grammars and regular expressions can be used for a variety of pu...
research
09/04/2017

Hypothesis Testing based Intrinsic Evaluation of Word Embeddings

We introduce the cross-match test - an exact, distribution free, high-di...
