Automatic Generation of Machine Learning Synthetic Data Using ROS

06/08/2021
by Kyle M. Hart, et al.

Data labeling is a time-intensive process, so many data scientists rely on tools to aid data generation and labeling. While these tools help automate labeling, many still require user interaction throughout the process. Most also target only a few network frameworks, so researchers exploring multiple frameworks must find additional tools or write conversion scripts. This paper presents an automated tool for generating synthetic data in arbitrary network formats. It uses Robot Operating System (ROS) and Gazebo, which are common tools in the robotics community. Through ROS paradigms, it allows extensive user customization of the simulation environment and the data generation process. In addition, a plugin-like framework allows new data format writers to be developed without changing the main body of code. Using this tool, the authors were able to generate an arbitrarily large image dataset in three unique training formats with approximately 15 minutes of user setup time and a variable amount of hands-off run time, depending on the dataset size. The source code for this data generation tool is available at https://github.com/Navy-RISE-Lab/nn_data_collection
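
The plugin-like writer framework mentioned in the abstract can be illustrated with a minimal sketch. This is not the tool's actual code or API: every name here (`LabeledBox`, `FormatWriter`, `register`, `WRITERS`, `export_all`) is hypothetical, and the two example formats (a YOLO-style text line and a generic JSON record) are chosen only for illustration, not taken from the paper. The point it shows is that the main export routine depends only on a writer interface, so a new format can be added by registering a new class.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List, Type


@dataclass
class LabeledBox:
    """One labeled object in an image (hypothetical record type)."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


class FormatWriter(ABC):
    """Interface that every output format implements (hypothetical)."""

    @abstractmethod
    def serialize(self, image_name: str, width: int, height: int,
                  boxes: List[LabeledBox], classes: List[str]) -> str:
        """Return the annotation text for one image."""


# Registry mapping a format name to its writer class.
WRITERS: Dict[str, Type[FormatWriter]] = {}


def register(name: str) -> Callable[[Type[FormatWriter]], Type[FormatWriter]]:
    """Class decorator that adds a writer to the registry."""
    def decorator(cls: Type[FormatWriter]) -> Type[FormatWriter]:
        WRITERS[name] = cls
        return cls
    return decorator


@register("yolo")
class YoloWriter(FormatWriter):
    """YOLO-style line: class index, then normalized center x, center y, width, height."""

    def serialize(self, image_name, width, height, boxes, classes):
        lines = []
        for box in boxes:
            x_center = (box.x_min + box.x_max) / 2 / width
            y_center = (box.y_min + box.y_max) / 2 / height
            box_w = (box.x_max - box.x_min) / width
            box_h = (box.y_max - box.y_min) / height
            lines.append(f"{classes.index(box.label)} {x_center:.6f} "
                         f"{y_center:.6f} {box_w:.6f} {box_h:.6f}")
        return "\n".join(lines)


@register("json")
class JsonWriter(FormatWriter):
    """Generic JSON record with pixel coordinates (illustrative only)."""

    def serialize(self, image_name, width, height, boxes, classes):
        record = {
            "image": image_name,
            "size": [width, height],
            "objects": [
                {"label": b.label, "bbox": [b.x_min, b.y_min, b.x_max, b.y_max]}
                for b in boxes
            ],
        }
        return json.dumps(record, sort_keys=True)


def export_all(image_name: str, width: int, height: int,
               boxes: List[LabeledBox], classes: List[str],
               formats: List[str]) -> Dict[str, str]:
    """Main routine: knows only the FormatWriter interface, not any format."""
    return {
        name: WRITERS[name]().serialize(image_name, width, height, boxes, classes)
        for name in formats
    }


if __name__ == "__main__":
    boxes = [LabeledBox("robot", 10, 20, 110, 220)]
    outputs = export_all("frame_0001.png", 200, 400, boxes, ["robot"], ["yolo", "json"])
    for name, text in outputs.items():
        print(f"[{name}]\n{text}")
```

In this sketch, supporting another training format means adding one more class decorated with `@register(...)`; `export_all` does not change. A ROS-based implementation would more likely load writers through a plugin mechanism at run time, which this sketch does not attempt to reproduce.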
