Traceability Support for Multi-Lingual Software Projects

06/30/2020
by Yalin Liu, et al.

Software traceability establishes associations between diverse software artifacts such as requirements, design, code, and test cases. Because manually creating and maintaining links is costly, many researchers have proposed automated approaches based on information retrieval techniques. However, many globally distributed software projects produce software artifacts written in two or more languages, and this intermingling of languages reduces the efficacy of automated tracing solutions. In this paper, we first analyze and discuss patterns of intermingled language use across multiple projects, and then evaluate several tracing algorithms, including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA), and various models that combine mono- and cross-lingual word embeddings with the Generative Vector Space Model (GVSM). Based on an analysis of 14 Chinese-English projects, our results show that the best performance is achieved using mono-lingual word embeddings integrated into GVSM, with machine translation as a preprocessing step.
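
To make the information-retrieval style of tracing concrete, the following is a minimal sketch of VSM-based link ranking using TF-IDF weights and cosine similarity. It is an illustration under stated assumptions, not the paper's implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_links`), the smoothed IDF formula, and the English-only tokenizer are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: an English-only tokenizer that keeps runs of ASCII letters.
    # Non-English terms are dropped, so artifacts with intermingled languages
    # would need translation or a language-aware tokenizer first.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict of term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption; other variants are common).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_links(sources, targets):
    # Score every (source, target) pair and return candidate trace links
    # as (source_index, target_index, score), highest score first.
    vecs = tfidf_vectors(sources + targets)
    s_vecs, t_vecs = vecs[: len(sources)], vecs[len(sources):]
    links = [
        (i, j, cosine(sv, tv))
        for i, sv in enumerate(s_vecs)
        for j, tv in enumerate(t_vecs)
    ]
    return sorted(links, key=lambda link: link[2], reverse=True)
```

Calling `rank_links(requirements, code_files)` on two lists of strings yields candidate links ordered by similarity. Because the score depends on shared terms, a requirement and a source file that describe the same concept in different languages would share few or no terms, which is one way to see why intermingled languages hurt this kind of approach.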

