A Large-scale Benchmark for Log Parsing

08/21/2023
by Zhihan Jiang, et al.

Log data is pivotal to tasks such as anomaly detection and failure diagnosis in the automated maintenance of software systems. Because logs are unstructured, log parsing is often required to transform them into a structured format for automated analysis. A variety of log parsers exist, making it vital to benchmark these tools to understand their features and performance. However, existing datasets for log parsing are limited in scale and representativeness, posing challenges for studies that aim to evaluate or develop log parsers. This problem becomes more pronounced when these parsers are evaluated for production use. To address these issues, we introduce a new collection of large-scale annotated log datasets, named LogPub, which more accurately mirrors log data observed in real-world software systems. LogPub comprises 14 datasets, with an average of 3.6 million log lines per dataset. Using LogPub, we re-evaluate 15 log parsers in a more rigorous and practical setting. We also propose a new evaluation metric to mitigate the sensitivity of current metrics to imbalanced data distributions. Furthermore, we are the first to scrutinize the detailed performance of log parsers on logs that represent rare system events and offer comprehensive information for system troubleshooting. Parsing such logs accurately is vital yet challenging. We believe that our work could shed light on the design and evaluation of log parsers in more realistic settings, thereby facilitating their deployment in production systems.
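To make the term "log parsing" concrete, the sketch below shows one simple way an unstructured log message can be reduced to a structured template by masking its variable parts. It is only an illustration: the masking rules, the `<*>` placeholder, and the function names `to_template` and `group_by_template` are assumptions made for this example, not the method of LogPub or of any of the 15 evaluated parsers.

```python
import re

# Hypothetical masking rules chosen for illustration only; real log parsers
# rely on more sophisticated heuristics or learned models.
_MASKS = [
    # IPv4 address with an optional port, e.g. 10.0.0.1:8080
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<*>"),
    # Hexadecimal literal, e.g. 0x1f
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<*>"),
    # Standalone decimal number, e.g. 35
    (re.compile(r"\b\d+\b"), "<*>"),
]


def to_template(message: str) -> str:
    """Replace variable-looking tokens with a placeholder to get a template."""
    for pattern, placeholder in _MASKS:
        message = pattern.sub(placeholder, message)
    return message


def group_by_template(messages):
    """Group raw log messages under the template each one reduces to."""
    groups = {}
    for message in messages:
        groups.setdefault(to_template(message), []).append(message)
    return groups
```

For example, `to_template("Connection from 10.0.0.1:8080 closed after 35 ms")` yields `"Connection from <*> closed after <*> ms"`, so messages that differ only in their parameters fall into the same group.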

research
11/08/2018

Tools and Benchmarks for Automated Log Parsing

Logs are imperative in the development and maintenance process of many s...
research
06/02/2023

An Evaluation of Log Parsing with ChatGPT

Software logs play an essential role in ensuring the reliability and mai...
research
03/17/2020

Self-Supervised Log Parsing

Logs are extensively used during the development and maintenance of soft...
research
06/02/2023

EvLog: Evolving Log Analyzer for Anomalous Logs Identification

Software logs record system activities, aiding maintainers in identifyin...
research
02/12/2021

On Automatic Parsing of Log Records

Software log analysis helps to maintain the health of software solutions...
research
08/15/2023

LogPrompt: Prompt Engineering Towards Zero-Shot and Interpretable Log Analysis

Automated log analysis is crucial in modern software-intensive systems f...
research
06/12/2018

A Directed Acyclic Graph Approach to Online Log Parsing

Logs are widely used in modern software system management because they a...