Prompting with Pseudo-Code Instructions

05/19/2023
by Mayank Mishra, et al.

Prompting with natural language instructions has recently emerged as a popular method of harnessing the capabilities of large language models. Given the inherent ambiguity of natural language, it is intuitive to ask whether less ambiguous prompt styles, such as pseudo-code, offer an advantage. In this paper, we explore whether prompting with pseudo-code instructions improves the performance of pre-trained language models. We manually create a dataset of pseudo-code prompts for 132 different tasks spanning classification, QA, and generative language tasks, sourced from the Super-NaturalInstructions dataset. Using these prompts along with their natural-language counterparts, we study performance on two LLM families: BLOOM and CodeGen. Our experiments show that pseudo-code instructions lead to better results, with an average absolute gain of 7-16 points in F1 score on classification tasks and a relative improvement of 12-38% in aggregate ROUGE-L scores across all tasks. Detailed ablation studies indicate that code comments, docstrings, and the structural clues encoded in pseudo-code all contribute to the improvement in performance. To the best of our knowledge, our work is the first to demonstrate how pseudo-code prompts can help improve the performance of pre-trained LMs.
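To make the contrast concrete, the sketch below prompts the same task in both styles: once as a natural-language instruction and once as a pseudo-code instruction (a Python-style function prototype with a docstring, the prompt style studied in the paper). The sentiment task, the prompt wording, and the bloom-560m checkpoint are illustrative assumptions, not the authors' actual prompts or evaluation setup.

```python
# A minimal sketch, not the authors' released code: the task, prompt
# wording, and model checkpoint below are illustrative assumptions.
from transformers import pipeline

# 1) The task as a natural-language instruction.
nl_prompt = (
    "Classify the sentiment of the review as positive or negative.\n"
    'Review: "The movie was a delight from start to finish."\n'
    "Sentiment:"
)

# 2) The same task as a pseudo-code instruction: a Python-style
# function prototype plus a docstring, the style studied in the paper.
pc_prompt = (
    "def classify_sentiment(review: str) -> str:\n"
    '    """Return "positive" or "negative" for the sentiment of review."""\n'
    "\n"
    '>>> classify_sentiment("The movie was a delight from start to finish.")\n'
)

# Greedy decoding on a small BLOOM checkpoint (the paper evaluates the
# BLOOM and CodeGen families; this checkpoint is chosen for convenience).
generator = pipeline("text-generation", model="bigscience/bloom-560m")
for prompt in (nl_prompt, pc_prompt):
    out = generator(prompt, max_new_tokens=5, do_sample=False)
    # The pipeline returns the prompt plus the completion; print only
    # the newly generated tokens.
    print(repr(out[0]["generated_text"][len(prompt):]))
```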

Related research

- 04/26/2023 · Exploring the Curious Case of Code Prompts
  Recent work has shown that prompting language models with code-like repr...
- 04/16/2023 · Automated Program Repair Based on Code Review: How do Pre-trained Transformer Models Perform?
  Sequence-to-sequence models have been used to transform erroneous progra...
- 04/12/2023 · LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity
  Cross-task generalization is a significant outcome that defines mastery ...
- 09/15/2021 · Can Machines Read Coding Manuals Yet? – A Benchmark for Building Better Language Models for Code Understanding
  Code understanding is an increasingly important application of Artificia...
- 06/02/2022 · Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code
  Few-shot learning with large-scale, pre-trained language models is a pow...
- 07/20/2023 · Multi-Method Self-Training: Improving Code Generation With Text, And Vice Versa
  Large Language Models have many methods for solving the same problem. Th...
- 12/20/2022 · Task Ambiguity in Humans and Language Models
  Language models have recently achieved strong performance across a wide ...
