LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision

by Thorsten Wittkopp, et al.

With the increasing scale and complexity of cloud operations, automated detection of anomalies in monitoring data such as logs will be an essential part of managing future IT infrastructures. However, many methods based on artificial intelligence, such as supervised deep learning models, require large amounts of labeled training data to perform well. In practice, this data is rarely available because labeling log data is expensive, time-consuming, and requires a deep understanding of the underlying system. We present LogLAB, a novel modeling approach for automated labeling of log messages that requires no manual work by experts. Our method relies on estimated failure time windows provided by monitoring systems to produce precisely labeled datasets in retrospect. It is based on the attention mechanism and uses a custom objective function for weakly supervised deep learning that accounts for imbalanced data. Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even with large failure time windows.
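To make the weak-labeling idea concrete, the following is a minimal sketch, not LogLAB itself. It assumes that each log message carries a timestamp and that a monitoring system reports failure windows as (start, end) pairs. Every message inside a window receives a provisional "abnormal" label, even though many of those messages are in fact normal. The names `LogMessage`, `weak_labels`, and `class_weights` are hypothetical. The inverse-frequency weighting is a generic way to handle class imbalance and stands in for the paper's custom objective function, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LogMessage:
    timestamp: float  # seconds since an arbitrary reference point (assumption)
    text: str


def weak_labels(messages: List[LogMessage],
                failure_windows: List[Tuple[float, float]]) -> List[int]:
    """Return 1 for messages inside any failure window, else 0.

    These labels are noisy: a window marks *when* a failure was estimated
    to occur, not *which* messages are actually anomalous.
    """
    labels = []
    for msg in messages:
        inside = any(start <= msg.timestamp <= end
                     for start, end in failure_windows)
        labels.append(1 if inside else 0)
    return labels


def class_weights(labels: List[int]) -> Dict[int, float]:
    """Inverse-frequency class weights (generic technique, not the paper's loss)."""
    n = len(labels)
    pos = sum(labels)
    neg = n - pos
    return {
        0: n / (2 * neg) if neg else 0.0,
        1: n / (2 * pos) if pos else 0.0,
    }


# Illustrative data only; not taken from the paper's datasets.
messages = [
    LogMessage(0.0, "service started"),
    LogMessage(5.0, "request served"),
    LogMessage(10.0, "connection refused"),
    LogMessage(15.0, "request served"),
    LogMessage(20.0, "request served"),
]
labels = weak_labels(messages, failure_windows=[(8.0, 16.0)])
weights = class_weights(labels)
```

In this toy run the message at t=15.0 is labeled abnormal only because it falls inside the window. A model trained on such weak labels would have to correct exactly this kind of noise, and wider windows make the noise worse.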




