Adding guardrails to advanced chatbots

06/13/2023
by Yanchen Wang, et al.

Generative AI models continue to grow more powerful. The launch of ChatGPT in November 2022 ushered in a new era of AI. ChatGPT and similar chatbots have a wide range of capabilities, from answering students' homework questions to creating music and art. There are already concerns that chatbots may replace humans in a variety of jobs. Because of the wide spectrum of data these chatbots are built on, we know that they will have human errors and human biases built into them, and those biases may cause significant harm and/or inequity toward different subpopulations. To understand the strengths and weaknesses of chatbot responses, we present a position paper that explores different use cases of ChatGPT to determine the types of questions that are answered fairly and the types that still need improvement. We find that ChatGPT is a fair search engine for the tasks we tested; however, it exhibits biases in both text generation and code generation. We also find that ChatGPT is very sensitive to changes in the prompt: small changes lead to different levels of fairness. This suggests that "corrections" or mitigation strategies should be implemented immediately to improve the fairness of these systems. We suggest several strategies to improve chatbots and advocate for an impartial review panel that has access to the model parameters, measures the levels of different types of biases, and recommends safeguards that move toward responses that are less discriminatory and more accurate.
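The abstract does not describe the paper's measurement method, but as a minimal, hypothetical illustration of one kind of bias probe it alludes to, the sketch below counts gendered pronouns across a chatbot's responses to paraphrases of the same neutral prompt. The `pronoun_skew` function and the sample responses are assumptions for illustration, not the authors' actual methodology; a real audit would use many prompts, larger samples, and richer demographic signals.

```python
from collections import Counter

def pronoun_skew(responses):
    """Crude gender-bias probe: report the fraction of gendered pronouns
    that are male across a batch of chatbot responses. Over demographically
    neutral prompts, a balanced model should hover near 0.5; large skews
    flag a potential bias worth investigating further."""
    counts = Counter()
    for text in responses:
        for token in text.lower().split():
            word = token.strip(".,;:!?\"'()")
            if word in {"he", "him", "his"}:
                counts["male"] += 1
            elif word in {"she", "her", "hers"}:
                counts["female"] += 1
    total = counts["male"] + counts["female"]
    if total == 0:
        return None  # no gendered pronouns observed
    return counts["male"] / total

# Hypothetical responses to paraphrases of one neutral prompt
responses = [
    "The doctor said he would review the chart.",
    "The doctor explained that he was running late.",
    "The doctor noted she had seen the patient before.",
]
print(pronoun_skew(responses))  # 2 of 3 gendered pronouns are male
```

Comparing such a statistic across slightly reworded prompts is one simple way to quantify the prompt sensitivity the paper reports, since each paraphrase may yield a different skew.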


