Self-Supervised Log Parsing

03/17/2020
by Sasho Nedelkoski, et al.

Logs are extensively used during the development and maintenance of software systems. They collect runtime events and allow tracking of code execution, which enables a variety of critical tasks such as troubleshooting and fault detection. However, large-scale software systems generate massive volumes of semi-structured log records, posing a major challenge for automated analysis. Parsing semi-structured records with free-form text log messages into structured templates is the first and crucial step that enables further analysis. Existing approaches rely on log-specific heuristics or manual rule extraction. These are often specialized to certain log types, which limits both their parsing performance and their generalization. We propose a novel parsing technique called NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling (MLM). In the process of parsing, the model extracts summarizations from the logs in the form of a vector embedding. This allows the coupling of the MLM as pre-training with a downstream anomaly detection task. We evaluate the parsing performance of NuLog on 10 real-world log datasets and compare the results with 12 parsing techniques. The results show that NuLog outperforms existing methods in parsing accuracy with an average of 99% and achieves the lowest edit distance to the ground truth templates. Additionally, two case studies are conducted to demonstrate the ability of the approach for log-based anomaly detection in both supervised and unsupervised scenarios. The results show that NuLog can be successfully used to support troubleshooting tasks. The implementation is available at https://github.com/nulog/nulog.
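To make the idea of parsing as masked-token prediction concrete, the following is a minimal sketch and not the NuLog implementation. It assumes that a token counts as part of the constant template when a predictor can recover it after masking, and as a variable parameter otherwise; the abstract does not spell out this rule. The self-supervised neural model is replaced here by a crude frequency count over (message length, token position), and the names `fit_predictor`, `predict_masked`, `parse`, and the `min_confidence` threshold are all hypothetical.

```python
from collections import Counter, defaultdict

WILDCARD = "<*>"  # placeholder for a variable parameter (assumed notation)


def fit_predictor(messages):
    """Stand-in for MLM training: count tokens per (length, position) slot."""
    counts = defaultdict(Counter)
    for message in messages:
        tokens = message.split()
        for position, token in enumerate(tokens):
            counts[(len(tokens), position)][token] += 1
    return counts


def predict_masked(counts, tokens, position):
    """Guess the token hidden at `position` and return (guess, confidence)."""
    slot = counts.get((len(tokens), position))
    if not slot:
        return None, 0.0
    guess, frequency = slot.most_common(1)[0]
    return guess, frequency / sum(slot.values())


def parse(counts, message, min_confidence=0.9):
    """Mask each token in turn; keep it only if it is confidently recovered."""
    tokens = message.split()
    template = []
    for position, token in enumerate(tokens):
        guess, confidence = predict_masked(counts, tokens, position)
        is_constant = guess == token and confidence >= min_confidence
        template.append(token if is_constant else WILDCARD)
    return " ".join(template)


# Made-up log lines, for illustration only.
logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Connection from 10.0.0.3 closed",
    "Deleting block blk_1 file /tmp/a",
    "Deleting block blk_2 file /tmp/b",
]
counts = fit_predictor(logs)
print(parse(counts, logs[0]))  # Connection from <*> closed
print(parse(counts, logs[3]))  # Deleting block <*> file <*>
```

The positional count only works when messages of the same length share a template, which is exactly the kind of log-specific heuristic the abstract criticizes. A learned masked language model replaces this count with a prediction conditioned on the surrounding tokens.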


