gambit – An Open Source Name Disambiguation Tool for Version Control Systems

03/09/2021
by Christoph Gote, et al.

Name disambiguation is a complex but highly relevant challenge when analysing real-world user data, such as data from version control systems. We propose gambit, a rule-based disambiguation tool that relies only on name and email information. We evaluate its performance against two commonly used algorithms with similar characteristics on manually disambiguated ground-truth data from the Gnome GTK project. Our results show that gambit significantly outperforms both algorithms, achieving an F1 score of 0.985.
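To make the reported metric concrete, the following is a minimal sketch of how an F1 score can be computed for a disambiguation result against manually labelled ground truth. It assumes a pairwise evaluation, in which every pair of name/email aliases assigned to the same person counts as one positive. The abstract does not state which F1 variant the authors use, so the function names and the pairwise scheme here are illustrative assumptions, not gambit's actual evaluation code.

```python
from itertools import combinations


def same_author_pairs(labels):
    """Return the set of index pairs that a labelling assigns to the same person."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pairwise_f1(predicted, truth):
    """Pairwise F1 between a predicted and a ground-truth labelling.

    Both arguments are lists of equal length; element k is the identity
    label given to alias k. Label values need not match across the two
    lists, since only the grouping matters. (Hypothetical helper.)
    """
    if len(predicted) != len(truth):
        raise ValueError("labellings must cover the same aliases")
    pred_pairs = same_author_pairs(predicted)
    true_pairs = same_author_pairs(truth)
    true_positives = len(pred_pairs & true_pairs)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_pairs)
    recall = true_positives / len(true_pairs)
    return 2 * precision * recall / (precision + recall)


# Five aliases: the ground truth merges the first three into one person,
# while the prediction only merges the first two.
truth = ["a", "a", "a", "b", "b"]
predicted = ["x", "x", "y", "z", "z"]
score = pairwise_f1(predicted, truth)
```

In this toy case the prediction makes no wrong merges (precision 1.0) but finds only two of the four true pairs (recall 0.5), so the pairwise F1 is 2/3.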


research
08/15/2022

A Pipeline for DNS-Based Software Fingerprinting

In this paper, we present the modular design and implementation of DONUT...
research
05/04/2023

A Study of Static Warning Cascading Tools (Experience Paper)

Static analysis is widely used for software assurance. However, static a...
research
11/30/2021

Automatic tracing of mandibular canal pathways using deep learning

There is an increasing demand in medical industries to have automated sy...
research
11/09/2020

Chapter Captor: Text Segmentation in Novels

Books are typically segmented into chapters and sections, representing c...
research
10/08/2018

An AMR Aligner Tuned by Transition-based Parser

In this paper, we propose a new rich resource enhanced AMR aligner which...
research
08/30/2022

Swin-transformer-yolov5 For Real-time Wine Grape Bunch Detection

In this research, an integrated detection model, Swin-transformer-YOLOv5...
research
10/16/2014

The HAWKwood Database

We present a database consisting of wood pile images, which can be used ...
