A Hazard Analysis Framework for Code Synthesis Large Language Models

07/25/2022
by   Heidy Khlaaf, et al.
0

Codex, a large language model (LLM) trained on a variety of codebases, exceeds the previous state of the art in its capacity to synthesize and generate code. Although Codex provides a plethora of benefits, models that may generate code on such scale have significant limitations, alignment problems, the potential to be misused, and the possibility to increase the rate of progress in technical fields that may themselves have destabilizing impacts or have misuse potential. Yet such safety impacts are not yet known or remain to be explored. In this paper, we outline a hazard analysis framework constructed at OpenAI to uncover hazards or safety risks that the deployment of models like Codex may impose technically, socially, politically, and economically. The analysis is informed by a novel evaluation framework that determines the capacity of advanced code generation techniques against the complexity and expressivity of specification prompts, and their capability to understand and execute them relative to human ability.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/19/2023

CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility

With the rapid evolution of large language models (LLMs), there is a gro...
research
06/02/2023

Is Model Attention Aligned with Human Attention? An Empirical Study on Large Language Models for Code Generation

Large Language Models (LLMs) have been demonstrated effective for code g...
research
02/27/2023

The (ab)use of Open Source Code to Train Large Language Models

In recent years, Large Language Models (LLMs) have gained significant po...
research
05/18/2023

Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation

Code generation aims to automatically generate source code from high-lev...
research
01/23/2023

StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

Text-to-image synthesis has recently seen significant progress thanks to...
research
07/07/2021

Evaluating Large Language Models Trained on Code

We introduce Codex, a GPT language model fine-tuned on publicly availabl...
research
12/12/2022

Automated Variable Renaming: Are We There Yet?

Identifiers, such as method and variable names, form a large portion of ...

Please sign up or login with your details

Forgot password? Click here to reset