MISIM: An End-to-End Neural Code Similarity System

06/05/2020
by   Fangke Ye, et al.

Code similarity systems are integral to a range of applications from code recommendation to automated construction of software tests and defect mitigation. In this paper, we present Machine Inferred Code Similarity (MISIM), a novel end-to-end code similarity system that consists of two core components. First, MISIM uses a novel context-aware similarity structure, which is designed to aid in lifting semantic meaning from code syntax. Second, MISIM provides a neural-based code similarity scoring system, which can be implemented with various neural network algorithms and topologies with learned parameters. We compare MISIM to three other state-of-the-art code similarity systems: (i) code2vec, (ii) Neural Code Comprehension, and (iii) Aroma. In our experimental evaluation across 45,780 programs, MISIM consistently outperformed all three systems, often by a large factor (upwards of 40.6x).
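
To make the idea of a code similarity score concrete, the sketch below compares two code snippets by turning each into a feature vector and taking the cosine of the angle between them. This is only an illustration under stated assumptions: the abstract does not specify MISIM's features, network, or scoring function, so the token-count featurizer here is a hypothetical stand-in for a learned neural encoder, and the names `featurize`, `cosine_similarity`, and `similarity_score` are invented for this example.

```python
import math
from collections import Counter


def featurize(code: str) -> Counter:
    """Hypothetical stand-in for a learned encoder.

    Counts whitespace-separated tokens. A real system would instead
    derive features from a structural representation of the code and
    map them through a trained neural network.
    """
    return Counter(code.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    # Only tokens present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty snippet is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_score(code_a: str, code_b: str) -> float:
    """Score two snippets; higher means more similar under this toy model."""
    return cosine_similarity(featurize(code_a), featurize(code_b))
```

A token-count model like this only captures surface overlap; the point of a system such as the one described above is to replace `featurize` with a representation that reflects what the code means rather than how it is spelled.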
