Lingua Manga: A Generic Large Language Model Centric System for Data Curation

06/20/2023
by   Zui Chen, et al.

Data curation is a wide-ranging area which contains many critical but time-consuming data processing tasks. However, the diversity of such tasks makes it challenging to develop a general-purpose data curation system. To address this issue, we present Lingua Manga, a user-friendly and versatile system that utilizes pre-trained large language models. Lingua Manga offers automatic optimization for achieving high performance and label efficiency while facilitating flexible and rapid development. Through three example applications with distinct objectives and users of varying levels of technical proficiency, we demonstrate that Lingua Manga can effectively assist both skilled programmers and low-code or even no-code users in addressing data curation challenges.

research
09/05/2023

Data-Juicer: A One-Stop Data Processing System for Large Language Models

The immense evolution in Large Language Models (LLMs) has underscored th...
research
02/21/2021

Automatic Code Generation using Pre-Trained Language Models

Recent advancements in natural language processing <cit.> <cit.> have le...
research
09/17/2023

Performance of the Pre-Trained Large Language Model GPT-4 on Automated Short Answer Grading

Automated Short Answer Grading (ASAG) has been an active area of machine...
research
10/31/2022

When Language Model Meets Private Library

With the rapid development of pre-training techniques, a number of langu...
research
01/16/2023

Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling

As opposed to scaling-up protein language models (PLMs), we seek improvi...
research
04/17/2023

Low-code LLM: Visual Programming over LLMs

Effectively utilizing LLMs for complex tasks is challenging, often invol...
