NetFlow Datasets for Machine Learning-based Network Intrusion Detection Systems

by Mohanad Sarhan, et al.

Machine Learning (ML)-based Network Intrusion Detection Systems (NIDSs) have proven to be a reliable intelligence tool for protecting networks against cyberattacks. Network data features have a great impact on the performance of ML-based NIDSs. However, evaluations of ML models are often unreliable, as each ML-enabled NIDS is trained and validated using different data features that may not contain security events. Therefore, a common ground feature set from multiple datasets is required to evaluate an ML model's detection accuracy and its ability to generalise across datasets. This paper presents NetFlow features from four benchmark NIDS datasets, UNSW-NB15, BoT-IoT, ToN-IoT, and CSE-CIC-IDS2018, generated using their publicly available packet capture files. In a real-world scenario, NetFlow features are relatively easier to extract from network traffic than the complex features used in the original datasets, as they are usually extracted from packet headers. The generated NetFlow datasets have been labelled for solving binary- and multi-class learning challenges. Preliminary results indicate that, across the four datasets, NetFlow features lead to similar binary-class results and lower multi-class classification results compared to their respective original feature sets. The NetFlow datasets, named NF-UNSW-NB15, NF-BoT-IoT, NF-ToN-IoT, NF-CSE-CIC-IDS2018 and NF-UQ-NIDS, are published for research purposes.
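
The idea of a common ground feature set with binary and multi-class labels can be illustrated with a minimal sketch. This is not the authors' pipeline: the field names (`in_bytes`, `out_bytes`, `attack`), the `"Benign"` class name, and both helper functions are hypothetical and chosen only for illustration. The sketch keeps the fields shared by every dataset, tags each flow with its source dataset, and derives a binary label from the multi-class attack label.

```python
from typing import Dict, List

# A flow record as a plain dict; all field names here are illustrative assumptions.
Flow = Dict[str, object]


def common_features(datasets: Dict[str, List[Flow]]) -> List[str]:
    """Return the sorted field names present in every record of every dataset."""
    shared = None
    for flows in datasets.values():
        for flow in flows:
            keys = set(flow)
            shared = keys if shared is None else shared & keys
    return sorted(shared or [])


def merge_on_common(datasets: Dict[str, List[Flow]],
                    label_field: str = "attack",
                    benign_value: str = "Benign") -> List[Flow]:
    """Merge datasets on their shared fields and add a binary label.

    The multi-class label stays in `label_field`; the derived binary
    label is 0 for benign flows and 1 for any attack class.
    """
    fields = common_features(datasets)
    merged = []
    for name, flows in datasets.items():
        for flow in flows:
            row = {f: flow[f] for f in fields}
            row["dataset"] = name  # remember where the flow came from
            row["label"] = 0 if flow[label_field] == benign_value else 1
            merged.append(row)
    return merged
```

Keeping the source dataset name on each merged row makes it possible to train on flows from one dataset and test on another, which is one way to probe the cross-dataset generalisation the abstract refers to.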




