Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks

04/05/2022
by   Tristan Thrush, et al.

We introduce Dynatask, an open-source system for setting up custom NLP tasks. It aims to greatly lower the technical knowledge and effort required to host and evaluate state-of-the-art NLP models and to conduct model-in-the-loop data collection with crowdworkers. Dynatask is integrated with Dynabench, a research platform for rethinking benchmarking in AI that facilitates human-and-model-in-the-loop data collection and evaluation. To create a task, users only need to write a short task configuration file, from which the relevant web interfaces and model hosting infrastructure are automatically generated. The system is available at https://dynabench.org/ and the full library can be found at https://github.com/facebookresearch/dynabench.
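
To make the idea of a configuration-driven task concrete, here is a minimal sketch of how a short declarative config could be validated and mapped to interface elements. It is not Dynatask's actual schema or code: the field names (`task_name`, `inputs`, `outputs`, `metric`), the type names, and the helpers `load_task_config` and `describe_interface` are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical task configuration. Every field name here is an illustrative
# assumption and is NOT the real Dynatask config format.
EXAMPLE_CONFIG = """
{
  "task_name": "sentiment",
  "inputs": [{"name": "statement", "type": "string"}],
  "outputs": [{"name": "label", "type": "multiclass",
               "labels": ["negative", "neutral", "positive"]}],
  "metric": "accuracy"
}
"""

# Field types this sketch knows how to render (assumed, not from the source).
SUPPORTED_TYPES = {"string", "multiclass"}


def load_task_config(text):
    """Parse a JSON config and reject fields the sketch cannot render."""
    config = json.loads(text)
    for section in ("inputs", "outputs"):
        for field in config.get(section, []):
            if field["type"] not in SUPPORTED_TYPES:
                raise ValueError(f"unsupported type: {field['type']}")
            if field["type"] == "multiclass" and not field.get("labels"):
                raise ValueError(f"field {field['name']} needs labels")
    return config


def widget_for(field):
    """Map one config field to a description of a UI widget."""
    if field["type"] == "multiclass":
        return f"dropdown: {field['name']} ({', '.join(field['labels'])})"
    return f"text box: {field['name']}"


def describe_interface(config):
    """Return one line per field, standing in for a generated web form."""
    fields = config.get("inputs", []) + config.get("outputs", [])
    return [widget_for(field) for field in fields]


if __name__ == "__main__":
    for line in describe_interface(load_task_config(EXAMPLE_CONFIG)):
        print(line)
```

The point of the sketch is the design choice the abstract describes: the task author edits only the declarative config, and everything user-facing is derived from it.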

research
02/24/2023

CARE: Collaborative AI-Assisted Reading Environment

Recent years have seen impressive progress in AI-assisted writing, yet t...
research
03/02/2022

Mukayese: Turkish NLP Strikes Back

Having sufficient resources for language X lifts it from the under-resou...
research
11/24/2022

PyTAIL: Interactive and Incremental Learning of NLP Models with Human in the Loop for Online Data

Online data streams make training machine learning models hard because o...
research
07/19/2021

Human-in-the-Loop for Data Collection: a Multi-Target Counter Narrative Dataset to Fight Online Hate Speech

Undermining the impact of hateful content with informed and non-aggressi...
research
10/07/2020

What Can We Learn from Collective Human Opinions on Natural Language Inference Data?

Despite the subjective nature of many NLP tasks, most NLU evaluations ha...
research
10/07/2021

Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning in NLP

The principle of independent causal mechanisms (ICM) states that generat...
research
02/20/2023

Arena-Rosnav 2.0: A Development and Benchmarking Platform for Robot Navigation in Highly Dynamic Environments

Following up on our previous works, in this paper, we present Arena-Rosn...
