Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning

01/10/2022
by Martin Weyssow, et al.

Progress in code modeling has been tremendous in recent years, thanks to natural language processing learning approaches built on state-of-the-art model architectures. Nevertheless, we believe that the current state of the art does not sufficiently exploit the full potential that data may bring to a learning process in software engineering. Our vision is built on the idea of leveraging multi-modal learning approaches to model the programming world. In this paper, we investigate one of the underlying ideas of our vision, whose objective, based on concept graphs of identifiers, is to leverage high-level relationships between domain concepts manipulated through particular language constructs. In particular, we propose to enhance an existing pretrained language model of code by jointly learning it with a graph neural network based on our concept graphs. We conducted a preliminary evaluation that shows a gain in the models' effectiveness for code search using a simple joint-learning method, which prompts us to further investigate our research vision.
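
To make the idea of combining a code representation with a concept graph of identifiers more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical graph construction (identifiers linked to the word-level concepts in their names), replaces the graph neural network with a single round of mean aggregation, and replaces the pretrained language model with a fixed toy vector. All function names (`split_identifier`, `build_concept_graph`, `propagate`, `fuse`) are illustrative, and training is omitted.

```python
import re

def split_identifier(name):
    # Split snake_case and camelCase names into lowercase concept tokens.
    words = re.findall(r"[A-Za-z][a-z]*|\d+", name.replace("_", " "))
    return [w.lower() for w in words]

def build_concept_graph(identifiers):
    # Assumed construction: a bipartite graph linking each identifier
    # node to the concept nodes found in its name.
    graph = {}
    for ident in identifiers:
        ident_node = ("identifier", ident)
        graph.setdefault(ident_node, set())
        for concept in split_identifier(ident):
            concept_node = ("concept", concept)
            graph[ident_node].add(concept_node)
            graph.setdefault(concept_node, set()).add(ident_node)
    return graph

def propagate(graph, features):
    # One round of mean aggregation over each node and its neighbours,
    # standing in for a graph neural network layer without learned weights.
    updated = {}
    for node, neighbours in graph.items():
        vectors = [features[node]] + [features[n] for n in neighbours]
        dim = len(features[node])
        updated[node] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return updated

def fuse(text_embedding, node_embeddings):
    # Mean-pool the identifier nodes and concatenate with the text embedding
    # (a placeholder for the output of a pretrained language model of code).
    idents = [v for (kind, _), v in node_embeddings.items() if kind == "identifier"]
    dim = len(idents[0])
    pooled = [sum(v[i] for v in idents) / len(idents) for i in range(dim)]
    return list(text_embedding) + pooled

graph = build_concept_graph(["getUserName", "user_id"])
# Toy input features: one-hot by node kind.
features = {n: ([1.0, 0.0] if n[0] == "identifier" else [0.0, 1.0]) for n in graph}
node_embeddings = propagate(graph, features)
joint = fuse([0.5, 0.5], node_embeddings)
```

In this sketch, `getUserName` and `user_id` become connected through the shared concept `user`, so each identifier's vector absorbs information from related identifiers before being fused with the text-side vector. In a real joint-learning setup, both sides would have trainable parameters optimized together on a task such as code search.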

research
10/20/2021

JavaBERT: Training a transformer-based model for the Java programming language

Code quality is and will be a crucial factor while developing new softwa...
research
05/08/2023

A Multi-Modal Context Reasoning Approach for Conditional Inference on Joint Textual and Visual Clues

Conditional inference on joint textual and visual clues is a multi-modal...
research
07/10/2023

Multi-modal Graph Learning over UMLS Knowledge Graphs

Clinicians are increasingly looking towards machine learning to gain ins...
research
05/28/2022

Contrastive Learning for Multi-Modal Automatic Code Review

Automatic code review (ACR), aiming to relieve manual inspection costs, ...
research
12/20/2022

A Survey on Pretrained Language Models for Neural Code Intelligence

As the complexity of modern software continues to escalate, software eng...
research
12/22/2021

The Importance of the Current Input in Sequence Modeling

The last advances in sequence modeling are mainly based on deep learning...
research
11/11/2022

DeepG2P: Fusing Multi-Modal Data to Improve Crop Production

Agriculture is at the heart of the solution to achieve sustainability in...
