AutoDC: Automated data-centric processing

11/23/2021
by Zac Yung-Chun Liu, et al.

AutoML (automated machine learning) has been developed extensively over the past few years for the model-centric approach. In the data-centric approach, however, the processes used to improve a dataset, such as fixing incorrect labels, adding examples that represent edge cases, and applying data augmentation, remain largely artisanal and expensive. Here we develop an automated data-centric tool (AutoDC), similar in purpose to AutoML, that aims to speed up the dataset improvement process. In our preliminary tests on 3 open source image classification datasets, AutoDC is estimated to reduce the manual time spent on data improvement tasks by roughly 80% while improving model accuracy by 10-15%.

research
03/08/2022

Towards Efficient Data-Centric Robust Machine Learning with Noise-based Augmentation

The data-centric machine learning aims to find effective ways to build a...
research
05/17/2023

Kitana: Efficient Data Augmentation Search for AutoML

AutoML services provide a way for non-expert users to benefit from high-...
research
12/07/2021

Augment & Valuate: A Data Enhancement Pipeline for Data-Centric AI

Data scarcity and noise are important issues in industrial applications ...
research
07/14/2023

DataAssist: A Machine Learning Approach to Data Cleaning and Preparation

Current automated machine learning (ML) tools are model-centric, focusin...
research
11/01/2021

A New Tool for Efficiently Generating Quality Estimation Datasets

Building of data for quality estimation (QE) training is expensive and r...
research
11/05/2021

Increasing Data Diversity with Iterative Sampling to Improve Performance

As a part of the Data-Centric AI Competition, we propose a data-centric ...
research
06/27/2023

DataCI: A Platform for Data-Centric AI on Streaming Data

We introduce DataCI, a comprehensive open-source platform designed speci...