Mukayese: Turkish NLP Strikes Back

03/02/2022
by Ali Safaya, et al.

Having sufficient resources for a language X lifts it out of the under-resourced class, but not necessarily out of the under-researched class. In this paper, we address the absence of organized benchmarks for the Turkish language. We demonstrate that languages such as Turkish lag behind the state of the art in NLP applications. As a solution, we present Mukayese, a set of NLP benchmarks for the Turkish language that covers several NLP tasks. We work on one or more datasets for each benchmark and present two or more baselines. Moreover, we present four new benchmarking datasets in Turkish for language modeling, sentence segmentation, and spell checking. All datasets and baselines are available at: https://github.com/alisafaya/mukayese

research
05/29/2022

L3Cube-MahaNLP: Marathi Natural Language Processing Datasets, Models, and Library

Despite being the third most popular language in India, the Marathi lang...
research
09/18/2020

FarsTail: A Persian Natural Language Inference Dataset

Natural language inference (NLI) is known as one of the central tasks in...
research
09/29/2021

Multilingual Fact Linking

Knowledge-intensive NLP tasks can benefit from linking natural language ...
research
04/05/2022

Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks

We introduce Dynatask: an open source system for setting up custom NLP t...
research
04/13/2021

EXPLAINABOARD: An Explainable Leaderboard for NLP

With the rapid development of NLP research, leaderboards have emerged as...
research
11/16/2021

DataCLUE: A Benchmark Suite for Data-centric NLP

Data-centric AI has recently proven to be more effective and high-perfor...
research
04/25/2022

How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language

More than 43% of the languages spoken in the world are endangered, and language loss currently occurs at an accelerated rate becau...
