Label Smoothing Improves Neural Source Code Summarization

03/28/2023
by   Sakib Haque, et al.

Label smoothing is a regularization technique for neural networks. Normally, neural models are trained toward a target output distribution that is a one-hot vector: a single 1 for the correct prediction and 0 for all other elements. Label smoothing sets the value at the correct prediction to slightly less than 1, then distributes the remainder across the other elements so that they are slightly greater than 0. Conceptually, label smoothing helps prevent a neural model from becoming "overconfident" by forcing it to consider alternatives, even if only slightly. Label smoothing has been shown to help in several areas of language generation, yet it typically requires considerable tuning and testing to achieve the best results. This tuning and testing has not been reported for neural source code summarization, a growing research area in software engineering that seeks to generate natural language descriptions of source code behavior. In this paper, we demonstrate the effect of label smoothing on several baselines in neural code summarization, conduct an experiment to find good parameters for label smoothing, and make recommendations for its use.
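To make the smoothing step described in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration, not the authors' implementation: the function names `smooth_labels` and `cross_entropy`, the five-element vocabulary, and the smoothing value `epsilon=0.1` are all assumptions chosen for the example. It follows the variant the abstract describes, where the remainder is split evenly among the incorrect elements; some libraries instead mix the one-hot vector with a uniform distribution over all elements.

```python
import math


def smooth_labels(correct_index, vocab_size, epsilon=0.1):
    """Build a smoothed target distribution (hypothetical helper).

    The correct element receives 1 - epsilon; the remaining epsilon is
    split evenly among the other vocab_size - 1 elements.
    """
    if vocab_size < 2:
        raise ValueError("vocab_size must be at least 2")
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must be in [0, 1)")
    off_value = epsilon / (vocab_size - 1)
    target = [off_value] * vocab_size
    target[correct_index] = 1.0 - epsilon
    return target


def cross_entropy(target, predicted):
    """Cross-entropy between a target and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)


# epsilon = 0 recovers the ordinary one-hot target.
one_hot = smooth_labels(2, 5, epsilon=0.0)

# epsilon = 0.1 (an assumed value) moves a little mass to the alternatives.
smoothed = smooth_labels(2, 5, epsilon=0.1)

# A prediction that puts almost all mass on one element is penalized more
# under the smoothed target than a prediction that matches the target.
overconfident = [0.001, 0.001, 0.996, 0.001, 0.001]
loss_overconfident = cross_entropy(smoothed, overconfident)
loss_matching = cross_entropy(smoothed, smoothed)
```

With the smoothed target, the loss is minimized when the model's output matches the smoothed distribution rather than when it pushes the correct element all the way to 1, which is one way to read the "overconfidence" explanation above. The appropriate value of `epsilon` is precisely the kind of parameter the paper says requires tuning.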


research
05/16/2023

Towards Modeling Human Attention from Eye Movements for Neural Source Code Summarization

Neural source code summarization is the task of generating natural langu...
research
04/06/2020

Improved Code Summarization via a Graph Neural Network

Automatic source code summarization is the task of generating natural la...
research
04/04/2022

Semantic Similarity Metrics for Evaluating Source Code Summarization

Source code summarization involves creating brief descriptions of source...
research
08/14/2023

Semantic Similarity Loss for Neural Source Code Summarization

This paper presents an improved loss function for neural source code sum...
research
03/22/2021

Project-Level Encoding for Neural Source Code Summarization of Subroutines

Source code summarization of a subroutine is the task of writing a short...
research
05/29/2021

CoDesc: A Large Code-Description Parallel Dataset

Translation between natural language and source code can help software d...
research
05/30/2021

Diversifying Dialog Generation via Adaptive Label Smoothing

Neural dialogue generation models trained with the one-hot target distri...
