hauWE: Hausa Words Embedding for Natural Language Processing

11/25/2019
by   Idris Abdulmumin, et al.

Word embeddings (distributed word vector representations) have become an essential component of many natural language processing (NLP) tasks such as machine translation, sentiment analysis, word analogy, named entity recognition and word similarity. Despite this, the only work that provides word vectors for the Hausa language is that of Bojanowski et al. [1], trained using fastText and consisting of only a few word vectors. This work presents word embedding models trained using Word2Vec's Continuous Bag of Words (CBoW) and Skip Gram (SG) architectures. The resulting models, hauWE (Hausa Words Embedding), are bigger and better than the only previous model, making them more useful in NLP tasks. To compare the models, each was used to predict the 10 most similar words to 30 randomly selected Hausa words. hauWE CBoW's 88.7% accuracy greatly outperformed Bojanowski et al. [1]'s 22.3%.
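The evaluation described above, ranking a vocabulary by similarity to a query word, is typically done with cosine similarity between embedding vectors. A minimal sketch, using toy 3-dimensional vectors (the words' values here are invented for illustration; real hauWE vectors are learned from a large Hausa corpus with Word2Vec):

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings for a few Hausa words (glosses in comments).
embeddings = {
    "ruwa": [0.9, 0.1, 0.0],  # "water"
    "kogi": [0.8, 0.2, 0.1],  # "river"
    "sama": [0.1, 0.9, 0.2],  # "sky"
    "wata": [0.0, 0.8, 0.3],  # "moon"
}

def most_similar(word, topn=10):
    """Rank all other words by cosine similarity to `word`."""
    query = embeddings[word]
    scores = [(w, cosine(query, v)) for w, v in embeddings.items() if w != word]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:topn]

print(most_similar("ruwa"))  # "kogi" ranks first among these toy vectors
```

In practice a library such as gensim provides this ranking directly (e.g. its `most_similar` method on trained Word2Vec models), which is the kind of query the 30-word comparison relies on.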
