MuLVE, A Multi-Language Vocabulary Evaluation Data Set

01/17/2022
by Anik Jacobsen, et al.
Berlin Institute of Technology (Technische Universität Berlin)

Vocabulary learning is vital to foreign language learning. Correct and adequate feedback is essential to successful and satisfying vocabulary training. However, many vocabulary and language evaluation systems operate on simple rules and do not account for real-life user learning data. This work introduces the Multi-Language Vocabulary Evaluation Data Set (MuLVE), a data set consisting of vocabulary cards and real-life user answers, each labeled to indicate whether the user answer is correct or incorrect. The data source is user learning data from the Phase6 vocabulary trainer. The data set contains vocabulary questions in German, with English, Spanish, and French as target languages, and is available in four variations that differ in pre-processing and deduplication. We fine-tune pre-trained BERT language models on the downstream task of vocabulary evaluation using the proposed MuLVE data set. The fine-tuned models achieve accuracy and F2-score above 95.5. The data set is available on the European Language Grid.
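
The abstract reports accuracy and F2-score for a binary correct/incorrect decision. As a minimal sketch of how these two metrics are computed, the Python below uses the standard F-beta definition with beta = 2, which weights recall more heavily than precision. The function names, the toy labels, and the choice of 1 ("answer correct") as the positive class are assumptions made for illustration; they are not taken from the paper or the data set.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def fbeta_score(y_true, y_pred, beta=2.0):
    """F-beta for binary labels; beta=2 gives the F2-score.

    Assumption: label 1 is the positive class (user answer is correct).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy labels (hypothetical, not MuLVE data): 4 correct and 4 incorrect answers.
gold = [1, 1, 1, 1, 0, 0, 0, 0]
pred = [1, 1, 1, 1, 1, 1, 0, 0]

print(f"accuracy = {accuracy(gold, pred):.3f}")
print(f"F1       = {fbeta_score(gold, pred, beta=1.0):.3f}")
print(f"F2       = {fbeta_score(gold, pred, beta=2.0):.3f}")
```

In this toy case recall is perfect while precision is 2/3, so the F2-score comes out higher than F1: with beta = 2, a missed positive costs more than a false alarm.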

08/04/2022

Vocabulary Transfer for Medical Texts

Vocabulary transfer is a transfer learning subtask in which language mod...
11/17/2020

MVP-BERT: Redesigning Vocabularies for Chinese BERT and Multi-Vocab Pretraining

Despite the development of pre-trained language models (PLMs) significan...
12/01/2022

CultureBERT: Fine-Tuning Transformer-Based Language Models for Corporate Culture

This paper introduces supervised machine learning to the literature meas...
02/27/2019

How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection

With the rapid development in deep learning, deep neural networks have b...
05/10/2021

ExpMRC: Explainability Evaluation for Machine Reading Comprehension

Achieving human-level performance on some of Machine Reading Comprehensi...
04/16/2021

Broccoli: Sprinkling Lightweight Vocabulary Learning into Everyday Information Diets

The learning of a new language remains to this date a cognitive task tha...
07/02/2022

VocabulARy: Learning Vocabulary in AR Supported by Keyword Visualisations

Learning vocabulary in a primary or secondary language is enhanced when ...