
MISIM: An End-to-End Neural Code Similarity System

06/05/2020
by Fangke Ye, et al.
MIT
Georgia Institute of Technology
Intel

Code similarity systems are integral to a range of applications from code recommendation to automated construction of software tests and defect mitigation. In this paper, we present Machine Inferred Code Similarity (MISIM), a novel end-to-end code similarity system that consists of two core components. First, MISIM uses a novel context-aware similarity structure, which is designed to aid in lifting semantic meaning from code syntax. Second, MISIM provides a neural-based code similarity scoring system, which can be implemented with various neural network algorithms and topologies with learned parameters. We compare MISIM to three other state-of-the-art code similarity systems: (i) code2vec, (ii) Neural Code Comprehension, and (iii) Aroma. In our experimental evaluation across 45,780 programs, MISIM consistently outperformed all three systems, often by a large factor (upwards of 40.6x).
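To make the two-stage idea in the abstract concrete (first derive a structure that abstracts away surface syntax, then score similarity between the resulting representations), here is a minimal sketch. It is not MISIM's method: the hand-built token counts stand in for a learned neural representation, cosine similarity stands in for the learned scoring model, and the names `KEYWORDS`, `features`, `cosine`, and `similarity` are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Assumption: a tiny keyword set for illustration; a real system would use
# a language-specific parser rather than regex tokenisation.
KEYWORDS = {"for", "while", "if", "else", "return", "def", "in"}


def features(code: str) -> Counter:
    """Tokenise code, abstract identifiers and numbers, count uni/bigrams."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)
    abstracted = []
    for tok in tokens:
        if tok in KEYWORDS:
            abstracted.append(tok)        # keep language keywords as-is
        elif tok[0].isalpha() or tok[0] == "_":
            abstracted.append("#VAR")     # hide identifier names
        elif tok.isdigit():
            abstracted.append("#NUM")     # hide numeric literals
        else:
            abstracted.append(tok)        # keep punctuation and operators
    bigrams = (f"{a} {b}" for a, b in zip(abstracted, abstracted[1:]))
    return Counter(abstracted) + Counter(bigrams)


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity(code_a: str, code_b: str) -> float:
    """Score two code snippets in [0, 1]; higher means more similar."""
    return cosine(features(code_a), features(code_b))
```

Under these assumptions, two snippets that differ only in variable names map to the same feature vector and score close to 1, while structurally different snippets score lower. A learned system would replace both the feature extraction and the scoring function with trained components.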

04/20/2018

Automatic Stance Detection Using End-to-End Memory Networks

We present a novel end-to-end memory network for stance detection, which...
03/24/2020

Context-Aware Parse Trees

The simplified parse tree (SPT) presented in Aroma, a state-of-the-art c...
11/28/2021

Code Clone Detection based on Event Embedding and Event Dependency

The code clone detection method based on semantic similarity has importa...
10/23/2018

Ain't Nobody Got Time For Coding: Structure-Aware Program Synthesis From Natural Language

Program synthesis from natural language (NL) is practical for humans and...
03/31/2023

Code Reviewer Recommendation for Architecture Violations: An Exploratory Study

Code review is a common practice in software development and often condu...
09/04/2021

A Neural Network-Based Linguistic Similarity Measure for Entrainment in Conversations

Linguistic entrainment is a phenomenon where people tend to mimic each o...
06/20/2020

The Lernaean Hydra of Data Series Similarity Search: An Experimental Evaluation of the State of the Art

Increasingly large data series collections are becoming commonplace acro...