KaWAT: A Word Analogy Task Dataset for Indonesian

06/17/2019
by Kemal Kurniawan, et al.
Kata.ai

We introduced KaWAT (Kata Word Analogy Task), a new word analogy task dataset for Indonesian. On this dataset, we evaluated several existing pretrained Indonesian word embeddings as well as embeddings trained on an Indonesian online news corpus. We also tested them on two downstream tasks and found that pretrained word embeddings helped either by reducing the number of training epochs or by yielding significant performance gains.
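
The abstract does not say how analogy questions are scored. As a rough illustration only, the sketch below uses the widely used vector-offset method (often called 3CosAdd): for a question "a is to b as c is to ?", it picks the vocabulary word whose unit vector is most similar to b - a + c, excluding the three question words. The function names `solve_analogy` and `analogy_accuracy`, the 3-dimensional toy vectors, and the example questions are all hypothetical; they are not taken from KaWAT or from the authors' code.

```python
import numpy as np


def solve_analogy(emb, a, b, c):
    """Answer "a is to b as c is to ?" with the vector-offset (3CosAdd) method.

    emb maps each word to a 1-D NumPy vector. The question words themselves
    are excluded from the candidate answers, as is conventional.
    """
    candidates = [w for w in emb if w not in (a, b, c)]
    matrix = np.stack([emb[w] for w in candidates])
    # Normalise rows so the dot product below ranks by cosine similarity.
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)

    def unit(word):
        return emb[word] / np.linalg.norm(emb[word])

    target = unit(b) - unit(a) + unit(c)
    scores = matrix @ target
    return candidates[int(np.argmax(scores))]


def analogy_accuracy(emb, questions):
    """Fraction of (a, b, c, d) questions answered correctly.

    Questions containing out-of-vocabulary words are skipped (an assumption;
    other evaluation protocols count them as errors).
    """
    answerable = [q for q in questions if all(w in emb for w in q)]
    if not answerable:
        return 0.0
    correct = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in answerable)
    return correct / len(answerable)


# Hypothetical toy embeddings (axes: roughly "male", "female", "royal").
toy_emb = {
    "pria": np.array([1.0, 0.0, 0.2]),    # man
    "wanita": np.array([0.0, 1.0, 0.2]),  # woman
    "raja": np.array([1.0, 0.0, 1.0]),    # king
    "ratu": np.array([0.0, 1.0, 1.0]),    # queen
    "kota": np.array([0.2, 0.2, -1.0]),   # city (distractor)
}

# Hypothetical questions in (a, b, c, expected d) form.
toy_questions = [
    ("pria", "wanita", "raja", "ratu"),
    ("raja", "ratu", "pria", "wanita"),
]

if __name__ == "__main__":
    print(solve_analogy(toy_emb, "pria", "wanita", "raja"))
    print(analogy_accuracy(toy_emb, toy_questions))
```

With real embeddings, `toy_emb` would be replaced by vectors loaded from a pretrained model, and the accuracy would typically be reported per analogy category rather than as a single number.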

09/30/2020

Development of Word Embeddings for Uzbek Language

In this paper, we share the process of developing word embeddings for th...
11/05/2019

Incremental Sense Weight Training for the Interpretation of Contextualized Word Embeddings

We present a novel online algorithm that learns the essence of each dime...
04/16/2018

A Deeper Look into Dependency-Based Word Embeddings

We investigate the effect of various dependency-based word embeddings on...
01/22/2019

Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings

We propose a novel and simple method for semi-supervised text classifica...
04/16/2021

Word2rate: training and evaluating multiple word embeddings as statistical transitions

Using pretrained word embeddings has been shown to be a very effective w...
05/18/2020

Contextual Embeddings: When Are They Worth It?

We study the settings for which deep contextual embeddings (e.g., BERT) ...
11/06/2019

Invariance and identifiability issues for word embeddings

Word embeddings are commonly obtained as optimizers of a criterion funct...

Code Repositories

kawat

Kata Word Analogy Task for Indonesian.

