RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems

06/05/2023
by Tianyang Liu, et al.

Large Language Models (LLMs) have greatly advanced code auto-completion systems, with the potential for substantial productivity gains for developers. However, current benchmarks mainly focus on single-file tasks, leaving an assessment gap for more complex, real-world, multi-file programming scenarios. To fill this gap, we introduce RepoBench, a new benchmark specifically designed for evaluating repository-level code auto-completion systems. RepoBench consists of three interconnected evaluation tasks: RepoBench-R (Retrieval), RepoBench-C (Code Completion), and RepoBench-P (Pipeline). These tasks respectively measure a system's ability to retrieve the most relevant code snippets from other files as cross-file context, to predict the next line of code given cross-file and in-file context, and to handle complex tasks that require both retrieval and next-line prediction. RepoBench aims to facilitate a more complete comparison of performance and to encourage continuous improvement in auto-completion systems. RepoBench is publicly available at https://github.com/Leolty/repobench.
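To make the three task types concrete, the following is a minimal sketch of a retrieve-then-complete flow. It is not taken from RepoBench itself: the token-level Jaccard similarity used for ranking, the helper names (`tokenize`, `jaccard`, `retrieve`, `build_prompt`), the prompt layout, and the toy snippets are all assumptions chosen for illustration. The completion model that would consume the prompt is left out.

```python
import re
from typing import List, Set


def tokenize(code: str) -> Set[str]:
    """Split code into a set of identifier-like and numeric tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+", code))


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (an assumed, simple lexical ranking score)."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query: str, candidates: List[str], k: int = 1) -> List[str]:
    """Retrieval step: rank cross-file snippets by similarity to the in-file code."""
    ranked = sorted(candidates, key=lambda c: jaccard(query, c), reverse=True)
    return ranked[:k]


def build_prompt(cross_file: List[str], in_file: str) -> str:
    """Completion step input: retrieved cross-file context followed by in-file context."""
    context = "\n".join(f"# cross-file context:\n{s}" for s in cross_file)
    return f"{context}\n{in_file}"


# Hypothetical snippets taken from other files of a repository.
candidates = [
    "def load_config(path):\n    return json.load(open(path))",
    "def add(a, b):\n    return a + b",
]

# In-file context; the next line is what a completion model would predict.
in_file = "from utils import load_config\nconfig = load_config("

# Pipeline: retrieve first, then assemble the prompt for next-line prediction.
top = retrieve(in_file, candidates, k=1)
prompt = build_prompt(top, in_file)
print(prompt)
```

In this sketch, calling `retrieve` alone corresponds to a retrieval-only setting, passing a fixed context to `build_prompt` corresponds to completion with given context, and chaining the two corresponds to the pipeline setting. A real system would likely swap the lexical score for a stronger retriever.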

research
12/20/2022

CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context

While pre-trained language models (LM) for code have achieved great succ...
research
03/22/2023

RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation

The task of repository-level code completion is to continue writing the ...
research
06/26/2022

Repository-Level Prompt Generation for Large Language Models of Code

With the success of large language models (LLMs) of code and their use a...
research
03/15/2022

ReACC: A Retrieval-Augmented Code Completion Framework

Code completion, which aims to predict the following code token(s) accor...
research
03/26/2020

On-the-Fly Adaptation of Source Code Models using Meta-Learning

The ability to adapt to unseen, local contexts is an important challenge...
research
08/19/2022

Topical: Learning Repository Embeddings from Source Code using Attention

Machine learning on source code (MLOnCode) promises to transform how sof...
research
06/19/2023

RepoFusion: Training Code Models to Understand Your Repository

Despite the huge success of Large Language Models (LLMs) in coding assis...