MuLVE, A Multi-Language Vocabulary Evaluation Data Set

by Anik Jacobsen et al.
Berlin Institute of Technology (Technische Universität Berlin)

Vocabulary learning is vital to foreign language learning, and correct, adequate feedback is essential to successful and satisfying vocabulary training. However, many vocabulary and language evaluation systems rely on simple rules and do not account for real-life user learning data. This work introduces the Multi-Language Vocabulary Evaluation Data Set (MuLVE), a data set consisting of vocabulary cards and real-life user answers, each labeled to indicate whether the user answer is correct or incorrect. The data source is user learning data from the Phase6 vocabulary trainer. The data set contains vocabulary questions in German, with English, Spanish, and French as target languages, and is available in four variations that differ in pre-processing and deduplication. We fine-tune pre-trained BERT language models on the downstream task of vocabulary evaluation using the proposed MuLVE data set. The fine-tuned models achieve accuracy and F2-scores above 95.5. The data set is available on the European Language Grid.
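For readers unfamiliar with the F2-score mentioned in the abstract, the following is a minimal sketch of the general F-beta formula with beta = 2, which weights recall more heavily than precision. The function name, the toy labels, and the choice of 1 as the positive class are assumptions made for illustration; the abstract does not state which label the authors treat as positive or how they compute the metric.

```python
def fbeta_score(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels (1 = positive, 0 = negative).

    Illustrative only: the positive-class convention is an assumption,
    not something stated in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    # F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical toy labels: high precision, low recall.
y_true = [1, 1, 1, 1]
y_pred = [1, 0, 0, 0]
f1 = fbeta_score(y_true, y_pred, beta=1.0)
f2 = fbeta_score(y_true, y_pred, beta=2.0)
print(f1, f2)
```

In this toy case recall is lower than precision, so the F2-score comes out below the F1-score, which illustrates how beta = 2 penalizes missed positives more strongly.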