Diet Code is Healthy: Simplifying Programs for Pre-Trained Models of Code

06/29/2022
by Zhaowei Zhang et al.

Pre-trained code representation models such as CodeBERT have demonstrated superior performance in a variety of software engineering tasks, yet they are computationally heavy: their complexity grows quadratically with the length of the input sequence. Our empirical analysis of CodeBERT's attention reveals that CodeBERT pays more attention to certain types of tokens and statements, such as keywords and data-relevant statements. Based on these findings, we propose DietCodeBERT, which aims at lightweight leverage of large pre-trained models for source code. DietCodeBERT simplifies the input program of CodeBERT with three strategies, namely word dropout, frequency filtering, and an attention-based strategy that selects the statements and tokens receiving the most attention weights during pre-training. This yields a substantial reduction in computational cost without hampering model performance. Experimental results on two downstream tasks show that DietCodeBERT provides results comparable to CodeBERT with 40% less computational cost in fine-tuning and testing.
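To make the attention-based strategy concrete, the sketch below (not the authors' implementation) scores each token of an input snippet by the attention it receives from CodeBERT and keeps only a fixed fraction of the highest-scoring tokens. The checkpoint name `microsoft/codebert-base`, the `keep_ratio` budget, and the averaging over layers and heads are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch of attention-based program simplification (illustrative only).
# Assumptions: microsoft/codebert-base checkpoint, keep_ratio budget, and
# attention averaged over all layers and heads.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base", output_attentions=True)
model.eval()

def simplify(code: str, keep_ratio: float = 0.6) -> str:
    """Keep the tokens that receive the most attention, preserving their order."""
    inputs = tokenizer(code, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # outputs.attentions: one (batch, heads, seq, seq) tensor per layer.
    # Average over layers and heads, then sum the attention each token *receives*.
    attn = torch.stack(outputs.attentions).mean(dim=(0, 2))  # (batch, seq, seq)
    received = attn.sum(dim=1).squeeze(0)                    # (seq,)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    k = max(1, int(keep_ratio * len(tokens)))
    keep = set(received.topk(k).indices.tolist())
    kept = [tok for i, tok in enumerate(tokens)
            if i in keep and tok not in tokenizer.all_special_tokens]
    return tokenizer.convert_tokens_to_string(kept)

print(simplify("def add(a, b):\n    return a + b"))
```

In the paper, selection operates on both statements and tokens under a length budget; this per-token top-k version only illustrates the core idea of pruning low-attention content so that shorter sequences are fed to the model.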

