A Deep Belief Network Based Machine Learning System for Risky Host Detection

by Wangyan Feng, et al.

To assure the cyber security of an enterprise, a SIEM (Security Information and Event Management) system is typically in place to normalize security events from different preventive technologies and flag alerts. Analysts in the security operations center (SOC) investigate the alerts to decide whether each one is truly malicious. However, the number of alerts is generally overwhelming: the majority are false positives, and the volume exceeds the SOC's capacity to handle them all. There is a great need to reduce the false positive rate as much as possible. While most previous research has focused on network intrusion detection, we focus on risk detection and propose an intelligent Deep Belief Network machine learning system. The system leverages alert information, various security logs, and analysts' investigation results in a real enterprise environment to flag hosts that have a high likelihood of being compromised. Text mining and graph-based methods are used to generate targets and create features for machine learning. In the experiment, the Deep Belief Network is compared with other machine learning algorithms, including multi-layer neural network, random forest, support vector machine, and logistic regression. Results on real enterprise data indicate that the Deep Belief Network machine learning system performs better than the other algorithms for our problem and is six times more effective than the current rule-based system. We also implement the whole system, from data collection, label creation, and feature engineering to host score generation, in a real enterprise production environment.
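One common way to quantify how much more effective a host-scoring model is than a baseline is lift among the top-k flagged hosts: the confirmed-risky rate in the top k divided by the overall risky rate. The sketch below is illustrative only. The abstract does not say which metric underlies the "six times" figure, so the choice of lift is an assumption, and the host names, scores, labels, and function names are all hypothetical.

```python
from typing import Dict, Set


def precision_at_k(scores: Dict[str, float], risky: Set[str], k: int) -> float:
    """Fraction of the k highest-scoring hosts that analysts confirmed as risky."""
    top_hosts = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(host in risky for host in top_hosts) / k


def lift_at_k(scores: Dict[str, float], risky: Set[str], k: int) -> float:
    """Precision in the top k relative to the base rate of risky hosts."""
    base_rate = len(risky) / len(scores)
    return precision_at_k(scores, risky, k) / base_rate


# Hypothetical toy data: model scores per host and analyst-confirmed risky hosts.
scores = {
    "host-a": 0.91, "host-b": 0.15, "host-c": 0.78,
    "host-d": 0.05, "host-e": 0.40, "host-f": 0.10,
}
risky = {"host-a", "host-c"}

print(lift_at_k(scores, risky, k=2))
```

Computing the same quantity for the hosts flagged by a rule-based system would give a like-for-like comparison of the two approaches under this assumed metric.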





Performance Comparison of Intrusion Detection Systems and Application of Machine Learning to Snort System

This study investigates the performance of two open source intrusion det...

Evaluation of Machine Learning Algorithms for Intrusion Detection System

Intrusion detection system (IDS) is one of the implemented solutions aga...

How do information security workers use host data? A summary of interviews with security analysts

Modern security operations centers (SOCs) employ a variety of tools for ...

Anti-Money Laundering Alert Optimization Using Machine Learning with Graphs

Money laundering is a global problem that concerns legitimizing proceeds...

Intrusion Detection Mechanism Using Fuzzy Rule Interpolation

Fuzzy Rule Interpolation (FRI) methods can serve deducible (interpolated...

FinGAN: Generative Adversarial Network for Analytical Customer Relationship Management in Banking and Insurance

Churn prediction in credit cards, fraud detection in insurance, and loan...

Exoplanet Validation with Machine Learning: 50 new validated Kepler planets

Over 30 'validation', where the statistical likelihood of a transit aris...