The Use of Synthetic Data to Train AI Models: Opportunities and Risks for Sustainable Development

08/31/2023
by Tshilidzi Marwala, et al.

In the current data-driven era, synthetic data is gaining prominence. Synthetic data is artificially generated data that resembles the characteristics of real-world data without containing actual personal information. Its prominence stems from its potential to safeguard privacy, increase the availability of data for research, and reduce bias in machine learning models. This paper investigates the policies governing the creation, utilization, and dissemination of synthetic data. Synthetic data can be a powerful instrument for protecting the privacy of individuals, but it also presents challenges, such as ensuring its quality and authenticity. A well-crafted synthetic data policy must balance privacy concerns against the utility of data, so that the data can be used effectively without compromising ethical or legal standards. Organizations and institutions must develop standardized guidelines and best practices to capitalize on the benefits of synthetic data while addressing its inherent challenges.

