Domain-Embeddings Based DGA Detection with Incremental Training Method

09/21/2020
by Xin Fang, et al.

DGA-based botnets, which use Domain Generation Algorithms (DGAs) to evade supervision, have become one of the most destructive threats to network security. Over the past decades, a wealth of defense mechanisms focusing on domain features have emerged to address the problem. Nonetheless, DGA detection remains a challenging task, both because of the sheer volume of Internet traffic and because linguistic features extracted from domain names alone may be insufficient and can easily be forged by adversaries to disturb detection. In this paper, we propose a novel DGA detection system that employs an incremental word-embeddings method to capture the interactions between end hosts and domains, characterize the time-series patterns of DNS queries for each IP address, and thereby explore temporal similarities between domains. We carefully modify the Word2Vec algorithm and leverage it to automatically learn dynamic and discriminative feature representations for over 1.9 million domains, and we develop a simple classifier for distinguishing malicious domains from benign ones. Given its ability to identify temporal patterns of domains and to update models incrementally, the proposed scheme makes progress towards adapting to the changing and evolving strategies of DGA domains. Our system is evaluated and compared with the state-of-the-art system FANCI and two deep-learning methods, CNN and LSTM, using data from TUNET, a large university network. The results suggest that our system outperforms these strong competitors by a large margin on multiple metrics while also achieving a remarkable speed-up in model updating.
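As a rough illustration of the idea of treating each IP address's DNS queries as a time-ordered "sentence" of domains, the sketch below groups a toy query log by client IP and emits the (target, context) pairs that a skip-gram style Word2Vec model would train on. It is only a minimal sketch under stated assumptions: the log format, the function names `build_host_sequences` and `skipgram_pairs`, and the `window` parameter are hypothetical, and the abstract does not describe the authors' actual modifications to Word2Vec or their incremental update procedure.

```python
from collections import defaultdict


def build_host_sequences(dns_log):
    """Group queried domains by client IP, ordered by timestamp.

    dns_log is an iterable of (timestamp, client_ip, domain) tuples;
    this record layout is an assumption made for illustration.
    """
    sequences = defaultdict(list)
    for timestamp, client_ip, domain in sorted(dns_log):
        sequences[client_ip].append(domain)
    return dict(sequences)


def skipgram_pairs(sequence, window=2):
    """Return (target, context) pairs for domains queried close together.

    Two domains form a pair when they are at most `window` positions
    apart in the same host's query sequence.
    """
    pairs = []
    for i, target in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sequence[j]))
    return pairs


# Toy DNS log (hypothetical data, deliberately out of time order).
dns_log = [
    (3, "10.0.0.1", "c.example"),
    (1, "10.0.0.1", "a.example"),
    (2, "10.0.0.1", "b.example"),
    (1, "10.0.0.2", "a.example"),
    (2, "10.0.0.2", "x.example"),
]

sequences = build_host_sequences(dns_log)
pairs = [p for seq in sequences.values() for p in skipgram_pairs(seq, window=1)]
```

Domains that repeatedly appear near each other in hosts' query streams end up sharing many such pairs, which is what allows an embedding model to place them close together regardless of how the domain names themselves are spelled.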

research
05/02/2019

Continuous Learning for Large-scale Personalized Domain Classification

Domain classification is the task of mapping spoken language utterances ...
research
11/15/2022

Detecting Malicious Domains Using Statistical Internationalized Domain Name Features in Top Level Domains

The Domain Name System (DNS) is a core Internet service that translates ...
research
08/06/2022

Detecting Algorithmically Generated Domains Using a GCNN-LSTM Hybrid Neural Network

Domain generation algorithm (DGA) is used by botnets to build a stealthy...
research
12/28/2021

FRIDA – Generative Feature Replay for Incremental Domain Adaptation

We tackle the novel problem of incremental unsupervised domain adaptatio...
research
11/21/2018

Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings

Domain generation algorithms (DGAs) are frequently employed by malware t...
research
11/09/2022

Domain-incremental Cardiac Image Segmentation with Style-oriented Replay and Domain-sensitive Feature Whitening

Contemporary methods have shown promising results on cardiac image segme...
research
11/01/2017

Killing Two Birds with One Stone: Malicious Domain Detection with High Accuracy and Coverage

Inference based techniques are one of the major approaches to analyze DN...
