Question Answering and Question Generation for Finnish

11/24/2022
by Ilmari Kylliäinen et al.

Recent advances in language modeling have improved the state of the art in question answering (QA) and question generation (QG). However, the development of modern neural models, their benchmarks, and the datasets for training them has mainly focused on English. Finnish, like many other languages, faces a shortage of large QA/QG training resources, which has prevented experimentation with state-of-the-art QA/QG fine-tuning methods. We present the first neural QA and QG models that work with Finnish. To train the models, we automatically translate the SQuAD dataset and then use normalization methods to reduce the amount of problematic data created during the translation. Using the synthetic data, together with the Finnish partition of the TyDi-QA dataset, we fine-tune several transformer-based models for both QA and QG and evaluate their performance. To the best of our knowledge, the resulting dataset is the first large-scale QA/QG resource for Finnish. This paper also sets the initial benchmarks for Finnish-language QA and QG.
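One common source of problematic data in a machine-translated extractive QA dataset is an answer that no longer appears verbatim in the translated passage. The abstract does not say which normalization methods were used, so the following is only a minimal sketch of one plausible filtering step, not the authors' procedure. The function names `realign_answer` and `filter_translated_examples` are hypothetical, and the record layout (`context`, `question`, `answer`) is an assumed simplification of the SQuAD format.

```python
from typing import Optional


def realign_answer(context: str, answer: str) -> Optional[dict]:
    """Locate a translated answer inside a translated context.

    Returns a SQuAD-style {"text", "answer_start"} dict, or None when the
    answer cannot be found. Hypothetical helper, not from the paper.
    """
    answer = answer.strip()
    if not answer:
        return None

    # First try an exact match.
    start = context.find(answer)
    if start != -1:
        return {"text": answer, "answer_start": start}

    # Fall back to a case-insensitive match. This is only safe when
    # lowercasing does not change string lengths, so check that first.
    lowered_context = context.lower()
    lowered_answer = answer.lower()
    if len(lowered_context) != len(context) or len(lowered_answer) != len(answer):
        return None
    start = lowered_context.find(lowered_answer)
    if start == -1:
        return None

    # Use the casing that actually occurs in the context.
    return {"text": context[start:start + len(answer)], "answer_start": start}


def filter_translated_examples(examples: list[dict]) -> list[dict]:
    """Keep only examples whose answer can be realigned to the context."""
    kept = []
    for example in examples:
        aligned = realign_answer(example["context"], example["answer"])
        if aligned is None:
            continue  # drop the example: the answer span was lost in translation
        kept.append({
            "context": example["context"],
            "question": example["question"],
            "answers": {
                "text": [aligned["text"]],
                "answer_start": [aligned["answer_start"]],
            },
        })
    return kept
```

A filter like this trades dataset size for consistency: every example that survives has an answer span that can be recovered from the passage, which is what extractive QA fine-tuning requires.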


