An Exploratory Study of Log Placement Recommendation in an Enterprise System

03/02/2021
by Jeanderson Candido, et al.

Logging is a development practice that plays an important role in the operations and monitoring of complex systems. Developers place log statements in the source code and use log data to understand how the system behaves in production. Unfortunately, anticipating where to log during development is challenging. Previous studies show the feasibility of leveraging machine learning to recommend log placement despite the data imbalance that arises because logging code is only a small fraction of the overall code base. However, it remains unknown how those techniques apply to an industry setting, and little is known about the effect of imbalanced data and sampling techniques. In this paper, we study the log placement problem in the code base of Adyen, a large-scale payment company. We analyze 34,526 Java files and 309,527 methods that sum up to more than 2M SLOC. We systematically measure the effectiveness of five models based on code metrics, explore the effect of sampling techniques, understand which features models consider to be relevant for the prediction, and evaluate whether we can exploit 388,086 methods from 29 Apache projects to learn where to log in an industry setting. Our best performing model achieves 79% balanced accuracy, 81% precision, and 60% recall. While sampling techniques improve recall, they penalize precision at a prohibitive cost. Experiments with open-source data yield under-performing models over Adyen's test set; nevertheless, they are useful due to their low rate of false positives. Our supporting scripts and tools are available to the community.
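
To illustrate why data imbalance matters when evaluating a log placement model, the following minimal sketch computes precision, recall, and balanced accuracy from confusion-matrix counts. The function name and all counts are hypothetical and chosen only for illustration; they are not taken from the study's data or tooling.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts.

    Positive class = "method should contain a log statement".
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Balanced accuracy averages the recall of both classes, so a model
    # cannot score well by always predicting the majority class.
    balanced_accuracy = (recall + specificity) / 2
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical data set: 1,000 methods, only 50 of which are logged.
# A trivial model that never recommends logging:
never_log = classification_metrics(tp=0, fp=0, fn=50, tn=950)

# A hypothetical model that finds 30 of the 50 logged methods
# while raising 7 false alarms:
some_model = classification_metrics(tp=30, fp=7, fn=20, tn=943)

for name, metrics in [("never_log", never_log), ("some_model", some_model)]:
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With these made-up counts, the trivial model reaches high plain accuracy but a balanced accuracy of only 0.5, which is why metrics that account for the minority class are better suited to an imbalanced problem like this one.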

