Lightweight Multilingual Software Analysis

08/03/2018
by Damian M. Lyons, et al.

Developer preferences, language capabilities, and the persistence of older languages all contribute to the trend that large software codebases are often multilingual, that is, written in more than one computer language. While developers can leverage monolingual software development tools to build software components, companies face the problem of managing the resulting large, multilingual codebases to address issues with security, efficiency, and quality metrics. The key challenge is the opaque nature of the language interoperability interface: one language calls procedures in a second (which may call a third, or even call back to the first), resulting in a potentially tangled, inefficient, and insecure codebase. An architecture is proposed for lightweight static analysis of large multilingual codebases: the MLSA architecture. Its modular and table-oriented structure addresses the open-ended nature of multiple languages and language interoperability APIs. As an application, we focus here on the construction of call-graphs that capture both inter-language and intra-language calls. The algorithms for extracting multilingual call-graphs from codebases are presented, and several examples of multilingual software engineering analysis are discussed. The state of the implementation and testing of MLSA is presented, and the implications for future work are discussed.
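
To make the idea of a table-oriented, multilingual call-graph concrete, the following is a minimal sketch in Python. It is not the MLSA implementation: the row format, the table names (C_CALLS, PY_CALLS), and the helper functions are all assumptions chosen for illustration. Each per-language table lists call sites, and the tables are merged into one graph whose nodes are (language, function) pairs, so that inter-language edges can be picked out.

```python
from collections import defaultdict

# Hypothetical per-language call tables, one row per call site:
# (caller, callee, language of the callee). This row format is an
# assumption for illustration, not the table format used by MLSA.
C_CALLS = [
    ("main", "run_script", "python"),  # C calls into Python via an interop API
    ("main", "log", "c"),              # ordinary intra-language call
]
PY_CALLS = [
    ("run_script", "parse", "python"),
    ("run_script", "log", "c"),        # Python calls back into C
]


def build_call_graph(tables):
    """Merge per-language call tables into one graph keyed by (language, function)."""
    graph = defaultdict(set)
    for caller_lang, rows in tables.items():
        for caller, callee, callee_lang in rows:
            graph[(caller_lang, caller)].add((callee_lang, callee))
    return graph


def inter_language_edges(graph):
    """Return the edges whose caller and callee are written in different languages."""
    return sorted(
        (src, dst)
        for src, dsts in graph.items()
        for dst in dsts
        if src[0] != dst[0]
    )


GRAPH = build_call_graph({"c": C_CALLS, "python": PY_CALLS})
CROSS = inter_language_edges(GRAPH)
```

In this sketch, CROSS holds the two cross-language edges (C main to Python run_script, and Python run_script back to C log), which are the calls a monolingual tool would not see. Keeping each language's results in its own table means that supporting another language or interoperability API only requires producing another table in the same shape.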

