A Directed Acyclic Graph Approach to Online Log Parsing

06/12/2018
by Pinjia He, et al.

Logs are widely used in modern software system management because they are often the only available data that record system events at runtime. In recent years, because of ever-increasing log sizes, data mining techniques have often been used to help developers and operators manage system reliability. A typical log-based reliability management procedure first parses log messages, because they are unstructured, and then applies data mining techniques to the parsed logs to obtain critical information about system behavior. Most existing studies focus on offline log parsing, which needs to parse logs in batch mode. However, software systems, especially distributed systems, require online monitoring and maintenance. Thus, a log parser that can parse log messages in a streaming manner is in high demand. To address this problem, we propose an online log parsing method, namely Drain, based on a directed acyclic graph that encodes specially designed rules for parsing. Drain can automatically generate a directed acyclic graph for a new system and update the graph according to the incoming log messages. In addition, Drain frees developers from the burden of parameter tuning by allowing them to use Drain with no pre-defined parameters. To evaluate the performance of Drain, we collect 11 log datasets generated by real-world systems, ranging from distributed systems, Web applications, supercomputers, and operating systems to standalone software. The experimental results show that Drain has the highest accuracy on all 11 datasets. Moreover, Drain obtains a 37.15% to 97.14% improvement in running time over the state-of-the-art online parsers. We also conduct a case study on a log-based anomaly detection task using Drain in the parsing step, which demonstrates its effectiveness in system reliability management.
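To make the idea of streaming, graph-guided parsing more concrete, here is a minimal sketch. It is not the Drain algorithm itself. It assumes a simplified structure in which a message is routed first by token count and then by its first token, after which it is matched against stored templates by positional token similarity. The names `StreamingParser`, `LogGroup`, `WILDCARD`, and the fixed `sim_threshold` are all hypothetical; in particular, the fixed threshold is a simplification, since the abstract states that Drain requires no pre-defined parameters.

```python
from __future__ import annotations

from dataclasses import dataclass

WILDCARD = "<*>"  # hypothetical placeholder for variable tokens


@dataclass
class LogGroup:
    """One log template plus the number of messages matched to it."""
    template: list[str]
    count: int = 0


class StreamingParser:
    """Illustrative online parser: routes each message through a small
    layered graph (token count -> first token -> candidate templates)."""

    def __init__(self, sim_threshold: float = 0.5):
        # Assumed fixed threshold; a simplification for this sketch.
        self.sim_threshold = sim_threshold
        self.graph: dict[int, dict[str, list[LogGroup]]] = {}

    @staticmethod
    def _similarity(template: list[str], tokens: list[str]) -> float:
        # Fraction of positions where template and message agree.
        same = sum(1 for t, tok in zip(template, tokens) if t == tok)
        return same / len(tokens)

    def parse(self, message: str) -> str:
        """Consume one message and return its current template."""
        tokens = message.split()
        if not tokens:
            return ""
        # First tokens containing digits are treated as variable.
        first = WILDCARD if any(c.isdigit() for c in tokens[0]) else tokens[0]
        groups = self.graph.setdefault(len(tokens), {}).setdefault(first, [])

        best = max(
            groups,
            key=lambda g: self._similarity(g.template, tokens),
            default=None,
        )
        if best is None or self._similarity(best.template, tokens) < self.sim_threshold:
            # No sufficiently similar template: start a new group.
            best = LogGroup(template=list(tokens))
            groups.append(best)
        else:
            # Generalize the template: differing positions become wildcards.
            best.template = [
                t if t == tok else WILDCARD
                for t, tok in zip(best.template, tokens)
            ]
        best.count += 1
        return " ".join(best.template)


parser = StreamingParser()
for line in [
    "Receiving block blk_1 from node_a",
    "Receiving block blk_2 from node_b",
    "Connection closed by peer",
]:
    print(parser.parse(line))
```

Because the graph is updated one message at a time, no batch pass over the log file is needed, which is the property the abstract highlights for online monitoring.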

research
04/24/2023

USTEP: Structuration des logs en flux grâce à un arbre de recherche évolutif

Logs record valuable system information at runtime. They are widely used...
research
11/08/2018

Tools and Benchmarks for Automated Log Parsing

Logs are imperative in the development and maintenance process of many s...
research
08/10/2022

LogStamp: Automatic Online Log Parsing Based on Sequence Labelling

Logs are one of the most critical data for service management. It contai...
research
03/17/2020

Self-Supervised Log Parsing

Logs are extensively used during the development and maintenance of soft...
research
08/21/2023

A Large-scale Benchmark for Log Parsing

Log data is pivotal in activities like anomaly detection and failure dia...
research
12/23/2021

SemParser: A Semantic Parser for Log Analysis

Logs, being run-time information automatically generated by software, re...
research
02/12/2021

On Automatic Parsing of Log Records

Software log analysis helps to maintain the health of software solutions...
