Data-Driven AI Model Signal-Awareness Enhancement and Introspection

11/10/2021
by Sahil Suneja, et al.

AI models for source code understanding tasks have made significant progress and are being adopted in production development pipelines. However, reliability concerns are being raised, especially about whether the models actually learn task-related aspects of source code. While recent model-probing approaches have observed a lack of signal awareness in many AI-for-code models, i.e., models not capturing task-relevant signals, they do not offer solutions to rectify this problem. In this paper, we explore data-driven approaches to enhance models' signal awareness: 1) we combine the software engineering (SE) concept of code complexity with the AI technique of curriculum learning; 2) we incorporate SE assistance into AI models by customizing Delta Debugging to generate simplified signal-preserving programs and adding them to the training dataset. With our techniques, we achieve up to a 4.8x improvement in model signal awareness. Using the notion of code complexity, we further present a novel model learning introspection approach from the perspective of the dataset.
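The two data-driven ideas in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: `complexity_proxy`, `curriculum_order`, `ddmin`, and the `keeps_signal` oracle are hypothetical names introduced here. The complexity measure is a crude keyword count standing in for a real code complexity metric, and the reduction loop is a simplified line-level variant of Delta Debugging.

```python
from typing import Callable, List, Sequence


def complexity_proxy(source: str) -> int:
    """Hypothetical stand-in for a code complexity metric:
    one plus the number of branching keywords in the source."""
    keywords = ("if", "for", "while", "case")
    tokens = source.replace("(", " ").replace(")", " ").split()
    return 1 + sum(tokens.count(k) for k in keywords)


def curriculum_order(samples: Sequence[str]) -> List[str]:
    """Order training samples from simple to complex, so a model
    could be shown easier programs first (curriculum learning)."""
    return sorted(samples, key=complexity_proxy)


def ddmin(lines: List[str],
          keeps_signal: Callable[[List[str]], bool]) -> List[str]:
    """Simplified Delta Debugging over program lines.

    Repeatedly tries to delete chunks of lines and keeps a deletion
    only if the oracle says the task-relevant signal is still present.
    The oracle is an assumption: in practice it could be any check
    that the reduced program still exhibits the labeled property.
    """
    n = 2  # current number of chunks to split the program into
    while len(lines) >= 2:
        chunk = max(1, len(lines) // n)
        reduced = False
        for start in range(0, len(lines), chunk):
            candidate = lines[:start] + lines[start + chunk:]
            if candidate and keeps_signal(candidate):
                lines = candidate          # accept the smaller program
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if chunk == 1:                 # finest granularity reached
                break
            n = min(n * 2, len(lines))     # retry with smaller chunks
    return lines


if __name__ == "__main__":
    program = [
        "int a = 0;",
        "char buf[4];",
        "a = a + 1;",
        "strcpy(buf, src);",
        "return a;",
    ]

    # Toy oracle: the "signal" is the buffer declaration plus the copy.
    def keeps_signal(candidate: List[str]) -> bool:
        return "char buf[4];" in candidate and "strcpy(buf, src);" in candidate

    simplified = ddmin(program, keeps_signal)
    print(simplified)

    # Augment the dataset with the simplified program, then order it.
    dataset = ["\n".join(program), "\n".join(simplified)]
    print([complexity_proxy(s) for s in curriculum_order(dataset)])
```

The simplified programs produced this way keep the label-relevant statements while dropping unrelated ones, which is the property the augmentation step relies on. Sorting by the complexity proxy then gives one possible easy-to-hard presentation order for training.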
