BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge

08/31/2023
by Xiangru Tang, et al.

Pre-trained language models like ChatGPT have significantly improved code generation. As these models scale up, their outputs are increasingly expected to handle more intricate tasks. Moreover, in bioinformatics, generating functional programs poses additional notable challenges due to the amount of domain knowledge required, the need for complicated data operations, and the intricate functional dependencies between those operations. Here, we present BioCoder, a benchmark developed to evaluate existing pre-trained models on generating bioinformatics code. For function-level code generation, BioCoder covers potential package dependencies, class declarations, and global variables. It incorporates 1026 functions and 1243 methods in Python and Java from GitHub, as well as 253 examples from the Rosalind Project. BioCoder includes a fuzz-testing framework for evaluation, which we have applied to many models, including InCoder, CodeGen, CodeGen2, SantaCoder, StarCoder, StarCoder+, InstructCodeT5+, and ChatGPT. Our detailed analysis of these models emphasizes the importance of domain knowledge, pragmatic code generation, and contextual understanding. Our dataset, benchmark, Docker images, and scripts required for testing are all available at https://github.com/gersteinlab/biocoder.
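To illustrate the idea behind fuzz-testing-based evaluation of generated code, the sketch below compares a model-generated candidate function against a reference implementation on randomized inputs. This is a minimal illustration, not BioCoder's actual framework: the task (GC content of a DNA sequence) and all function names here are hypothetical examples, whereas BioCoder's real tasks are drawn from GitHub bioinformatics repositories and the Rosalind Project.

```python
import random

# Hypothetical reference ("gold") implementation: GC content of a DNA sequence.
def gc_content_reference(seq: str) -> float:
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

# A stand-in for a model-generated candidate to be evaluated.
def gc_content_candidate(seq: str) -> float:
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq) if seq else 0.0

def fuzz_equivalent(reference, candidate, trials: int = 200, seed: int = 0) -> bool:
    """Fuzz test: check behavioral agreement on many randomized inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, 50)
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        if abs(reference(seq) - candidate(seq)) > 1e-9:
            return False  # behavioral divergence found on this input
    return True
```

Under this scheme, `fuzz_equivalent(gc_content_reference, gc_content_candidate)` passes only if the candidate matches the reference across all sampled inputs, which approximates functional correctness without hand-written test cases.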


