FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

02/16/2022
by Minh Van Nguyen, et al.

This paper presents FAMIE, a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction. FAMIE is designed to address a fundamental problem in existing AL frameworks where annotators need to wait for a long time between annotation batches due to the time-consuming nature of model training and data selection at each AL iteration. This hinders the engagement, productivity, and efficiency of annotators. Based on the idea of using a small proxy network for fast data selection, we introduce a novel knowledge distillation mechanism to synchronize the proxy network with the main large model (i.e., BERT-based) to ensure the appropriateness of the selected annotation examples for the main model. Our AL framework can support multiple languages. The experiments demonstrate the advantages of FAMIE in terms of competitive performance and time efficiency for sequence labeling with AL. We publicly release our code (<https://github.com/nlp-uoregon/famie>) and demo website (<http://nlp.uoregon.edu:9000/>). A demo video for FAMIE is provided at: <https://youtu.be/I2i8n_jAyrY>.
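The two ideas in the abstract, selecting data with a small proxy network and keeping that proxy in sync with the main model through knowledge distillation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the FAMIE code base: the function names are hypothetical, the distillation objective is taken to be a KL divergence between temperature-softened label distributions, and the selection criterion is taken to be predictive entropy. The sketch also scores one label distribution per example, whereas sequence labeling would aggregate such scores over the tokens of a sentence.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(main_logits, proxy_logits, temperature=2.0):
    """KL(main || proxy) over temperature-softened label distributions.

    Assumed objective: minimizing it pulls the proxy's predictions toward
    those of the main model, so the proxy's uncertainty stays informative
    for the main model.
    """
    p = softmax(main_logits, temperature)   # main (teacher) distribution
    q = softmax(proxy_logits, temperature)  # proxy (student) distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def entropy(probs):
    """Shannon entropy of a probability distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(proxy_logits_per_example, batch_size):
    """Return indices of the examples the proxy is least certain about.

    Assumed criterion: highest predictive entropy first.
    """
    scores = [entropy(softmax(logits)) for logits in proxy_logits_per_example]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


if __name__ == "__main__":
    unlabeled = [[5.0, 0.0, 0.0], [0.1, 0.0, 0.0], [2.0, 0.0, 0.0]]
    print(select_batch(unlabeled, batch_size=2))
    print(distillation_loss([3.0, 0.0, 0.0], [0.0, 0.0, 3.0]))
```

Under these assumptions, only the cheap proxy has to score the unlabeled pool between annotation batches, which is the source of the shorter waiting time the abstract describes. The distillation term is what keeps the proxy's choices relevant to the main model.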


