Auctus: A Dataset Search Engine for Data Augmentation

02/10/2021
by Fernando Chirigati, et al.

Machine learning (ML) models are increasingly being adopted in many applications. The quality of these models depends critically on the data on which they are trained, and augmenting that input data with external data gives us the opportunity to create better models. However, the massive number of datasets available on the Web makes it challenging to find data suitable for augmentation. In this demo, we present our ongoing efforts to develop a dataset search engine tailored for data augmentation. Our prototype, named Auctus, automatically discovers datasets on the Web and, unlike existing dataset search engines, infers consistent metadata for indexing and supports join and union search queries. Auctus is already being used in a real deployment environment to improve the performance of ML models. The demonstration will include various real-world data augmentation examples, and visitors will be able to interact with the system.
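To make the join and union queries mentioned above concrete, the following is a minimal sketch of what each kind of augmentation does to a table once a suitable external dataset has been found. It is not Auctus's API or implementation: the function names (`join_augment`, `union_augment`) and the toy tables are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def join_augment(base: List[Row], external: List[Row], key: str) -> List[Row]:
    """Left join: add the external columns to each base row that shares `key`."""
    # If the external table repeats a key, the last row with that key is used.
    lookup = {row[key]: row for row in external}
    augmented = []
    for row in base:
        match = lookup.get(row[key], {})
        # Base values win on column-name clashes; unmatched rows stay unchanged.
        augmented.append({**match, **row})
    return augmented


def union_augment(base: List[Row], external: List[Row]) -> List[Row]:
    """Union: append external rows that have the same columns as the base table."""
    columns = set(base[0]) if base else set()
    return base + [row for row in external if set(row) == columns]


# Hypothetical toy tables.
trips = [{"zip": "10001", "trips": 120}, {"zip": "10002", "trips": 80}]
weather = [{"zip": "10001", "rain_mm": 3.2}]
more_trips = [{"zip": "10003", "trips": 95}]

joined = join_augment(trips, weather, "zip")   # adds a new column (feature)
unioned = union_augment(trips, more_trips)     # adds new rows (examples)
```

A join widens the training table with new features, while a union lengthens it with new examples; which one helps a given model depends on the task.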


research
03/21/2020

ARDA: Automatic Relational Data Augmentation for Machine Learning

Automatic machine learning (AutoML) is a family of techniques to automate the ...
research
05/17/2023

Kitana: Efficient Data Augmentation Search for AutoML

AutoML services provide a way for non-expert users to benefit from high-...
research
09/29/2022

Augmentation Backdoors

Data augmentation is used extensively to improve model generalisation. H...
research
09/29/2022

Automatic Data Augmentation via Invariance-Constrained Learning

Underlying data structures, such as symmetries or invariances to transfo...
research
06/14/2020

FenceMask: A Data Augmentation Approach for Pre-extracted Image Features

We propose a novel data augmentation method named 'FenceMask' that exhib...
research
10/24/2022

LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations

Data augmentations are important in training high-performance 3D object ...
research
03/25/2023

Thistle: A Vector Database in Rust

We present Thistle, a fully functional vector database. Thistle is an en...
