Language Models Can Teach Themselves to Program Better

07/29/2022
by Patrick Haluptzok, et al.

This work shows how one can use large-scale language models (LMs) to synthesize programming problems with verified solutions, in the form of programming puzzles, which can then in turn be used to fine-tune those same models, improving their performance. This work builds on two recent developments. First, LMs have achieved breakthroughs in non-trivial reasoning and algorithm implementation, generating code that can solve some intermediate-level competitive programming problems. However, training code LMs involves curated sets of natural-language problem descriptions and source-code tests and solutions, which are limited in size. Second, a new format of programming challenge called a programming puzzle was introduced, which does not require a natural language description and is directly specified by a source-code test. In this work we show how generating synthetic programming puzzles and solutions, verified for correctness by a Python interpreter, can be used to improve performance in solving test puzzles from P3, a public benchmark set of Python Programming Puzzles. Additionally, we release a dataset of 1 million puzzles and solutions generated by the Codex model, which we show can improve smaller models through fine-tuning.
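To make the puzzle format concrete, here is a minimal sketch in Python. It assumes the P3 convention that a puzzle is a function `f` returning a boolean and a solution is a function `g` taking no arguments, with the solution counting as correct when `f(g())` is `True`. The specific puzzle, the candidate solutions, and the helper name `is_verified` are illustrative assumptions, not taken from the paper or its dataset, and the sketch leaves out the timeouts and sandboxing that a real pipeline would need when executing model-generated code.

```python
# Illustrative puzzle source: the test itself is the full specification,
# so no natural-language description is needed.
PUZZLE = '''
def f(x: int) -> bool:
    return x * x % 1000 == 444
'''

# Hypothetical candidate solutions, standing in for model-generated code.
CANDIDATES = [
    "def g():\n    return 12\n",      # wrong answer: 144 % 1000 != 444
    "def g():\n    return 38\n",      # correct: 38 * 38 == 1444
    "def g():\n    return 1 / 0\n",   # crashes at run time
]


def is_verified(puzzle_src: str, solution_src: str) -> bool:
    """Run both sources and report whether f(g()) evaluates to True.

    Any exception (syntax error, runtime error, missing f or g) counts
    as a failed verification. No timeout or sandbox is applied here.
    """
    namespace: dict = {}
    try:
        exec(puzzle_src, namespace)
        exec(solution_src, namespace)
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        return False


# Keep only the (puzzle, solution) pairs that the interpreter confirms.
verified = [src for src in CANDIDATES if is_verified(PUZZLE, src)]
```

Only the pairs that pass this kind of check would be kept as fine-tuning data, which is what lets the interpreter act as the correctness filter for the synthetic dataset.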


