Less is More: Robust and Novel Features for Malicious Domain Detection

06/02/2020
by   Chen Hajaj, et al.

Malicious domains are increasingly common and pose a severe cybersecurity threat. In particular, many current cyber attacks use URLs for attack communications (e.g., C&C, phishing, and spear-phishing). Despite continuous progress in detecting these attacks, alarming problems remain open, notably the weak spots of the defense mechanisms themselves. Since machine learning has become one of the most prominent methods of malware detection, this paper proposes a robust feature selection mechanism that yields malicious domain detection models resistant to evasion attacks. An evaluation on empirical data shows that the mechanism performs well. The paper makes two main contributions. First, it analyzes robust feature selection over features widely used in the literature. Even though the feature set is reduced by more than half (from nine to four features), the classifier's performance improves: the model's F1-score rises from 92.92% to 95.81%. Second, it introduces novel features that are robust to manipulation by the adversary. Based on an extensive evaluation of the different feature sets and commonly used classification models, the paper shows that models built on robust features resist malicious perturbations while remaining useful for classifying non-manipulated data.
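To make the abstract's central claim concrete, the following minimal sketch compares F1-scores before and after an evasion attempt. It is not the authors' method or data: the samples, the two features (one the adversary can manipulate, one it cannot), the threshold rule, and the names `f1_score`, `predict`, and `evade` are all hypothetical and chosen only for illustration.

```python
# Toy illustration only: all data, features, and thresholds are hypothetical.

# Each sample: (manipulable_feature, robust_feature, label); label 1 = malicious.
SAMPLES = [
    (0.9, 0.8, 1), (0.8, 0.9, 1), (0.7, 0.7, 1),
    (0.2, 0.1, 0), (0.1, 0.3, 0), (0.3, 0.2, 0),
]


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predict(samples, feature_index, threshold=0.5):
    """Flag a sample as malicious when the chosen feature exceeds the threshold."""
    return [1 if s[feature_index] > threshold else 0 for s in samples]


def evade(samples):
    """Assumed adversary: zeroes the manipulable feature of malicious samples."""
    return [(0.0, r, y) if y == 1 else (m, r, y) for m, r, y in samples]


labels = [y for _, _, y in SAMPLES]
attacked = evade(SAMPLES)

# Index 0 is the manipulable feature, index 1 the robust one.
clean_manipulable = f1_score(labels, predict(SAMPLES, 0))
clean_robust = f1_score(labels, predict(SAMPLES, 1))
attacked_manipulable = f1_score(labels, predict(attacked, 0))
attacked_robust = f1_score(labels, predict(attacked, 1))

print(clean_manipulable, clean_robust)        # both rules fit the clean data
print(attacked_manipulable, attacked_robust)  # only the robust rule survives
```

In this toy setting both rules classify the clean data perfectly, but only the rule built on the feature the adversary cannot change keeps its F1-score after the perturbation.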


