Recurrent Neural Network Language Models for Open Vocabulary Event-Level Cyber Anomaly Detection

12/02/2017
by Aaron Tuor, et al.

Automated analysis methods are crucial aids for monitoring and defending a network to protect the sensitive or confidential data it hosts. This work introduces a flexible, powerful, and unsupervised approach to detecting anomalous behavior in computer and network logs, one that largely eliminates the domain-dependent feature engineering employed by existing methods. By treating system logs as threads of interleaved "sentences" (event log lines) to train online unsupervised neural network language models, our approach provides an adaptive model of normal network behavior. We compare the effectiveness of both standard and bidirectional recurrent neural network language models at detecting malicious activity within network log data. Extending these models, we introduce a tiered recurrent architecture, which provides context by modeling sequences of users' actions over time. On the Los Alamos National Laboratory Cyber Security dataset, we observe superior performance compared to Isolation Forest and Principal Components Analysis, two popular anomaly detection algorithms. For log-line-level red team detection, our best performing character-based model achieves a test set area under the receiver operating characteristic curve of 0.98, demonstrating the strong fine-grained anomaly detection performance of this approach on open vocabulary logging sources.
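The idea of scoring each log line by how poorly a language model predicts it can be illustrated with a minimal sketch. This is not the paper's method: a smoothed character bigram model stands in for the recurrent network, and the log-line format, the class name `CharBigramModel`, the smoothing constant, and the helper `roc_auc` are all assumptions made for illustration. Only the overall recipe (fit online to observed lines, use per-line negative log-likelihood as the anomaly score, evaluate with area under the ROC curve) follows the abstract.

```python
import math
from collections import Counter


class CharBigramModel:
    """Illustrative stand-in for an RNN language model (assumption, not the paper's model)."""

    def __init__(self, vocab_size=256):
        self.vocab_size = vocab_size      # assumed alphabet size used for add-one smoothing
        self.pair_counts = Counter()      # counts of (previous char, current char)
        self.context_counts = Counter()   # counts of previous char

    def update(self, line):
        # Online update: count every character transition in the line.
        prev = "<s>"
        for ch in line:
            self.pair_counts[(prev, ch)] += 1
            self.context_counts[prev] += 1
            prev = ch

    def score(self, line):
        # Anomaly score: mean negative log-likelihood per character.
        prev, total = "<s>", 0.0
        for ch in line:
            p = (self.pair_counts[(prev, ch)] + 1) / (
                self.context_counts[prev] + self.vocab_size
            )
            total -= math.log(p)
            prev = ch
        return total / max(len(line), 1)


def roc_auc(scores, labels):
    # Probability that a random positive outscores a random negative (ties count half).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical log lines; the field layout is invented for this example.
normal_line = "U12,C1,C2,Kerberos,Success"
odd_line = "U99,C9,C7,NTLM,Fail"

model = CharBigramModel()
for _ in range(50):
    model.update(normal_line)

# A line unlike anything seen in training should receive the higher score.
print(model.score(normal_line) < model.score(odd_line))
```

Because the model works on characters rather than a fixed token list, previously unseen user or host names still receive a score, which is the sense in which such an approach is open vocabulary.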

research
03/13/2018

Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection

Deep learning has recently demonstrated state-of-the-art performance on ...
research
03/29/2021

Dynamically Modelling Heterogeneous Higher-Order Interactions for Malicious Behavior Detection in Event Logs

Anomaly detection in event logs is a promising approach for intrusion de...
research
10/02/2017

Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams

Analysis of an organization's computer network activity is a key compone...
research
06/07/2023

IsoEx: an explainable unsupervised approach to process event logs cyber investigation

39 seconds. That is the timelapse between two consecutive cyber attacks ...
research
06/07/2020

Hybrid Model for Anomaly Detection on Call Detail Records by Time Series Forecasting

Mobile network operators store an enormous amount of information like lo...
research
11/18/2021

LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

The system log generated in a computer system refers to large-scale data...
research
08/27/2021

End-To-End Anomaly Detection for Identifying Malicious Cyber Behavior through NLP-Based Log Embeddings

Rule-based IDS (intrusion detection systems) are being replaced by more ...
