PASSVM: A Highly Accurate Online Fast Flux Detection System

06/05/2020 ∙ by Basheer Al-Duwairi, et al. ∙ JUST

Fast Flux service networks (FFSNs) are used by adversaries to make their malicious servers highly resilient while keeping them hidden from direct access. In this technique, a large number of botnet machines, known as flux agents, work as proxies that relay traffic between end users and a malicious mothership server controlled by an adversary. Various mechanisms have been proposed for detecting FFSNs. Such mechanisms depend on collecting a large amount of DNS traffic traces and require a considerable amount of time to identify fast flux domains. In this paper, we propose an efficient AI-based online fast flux detection system that performs highly accurate and extremely fast detection of fast flux domains. The proposed system, called PASSVM, is based on features that are associated with the DNS response messages of a given domain name. The approach relies on features that are stored in two local databases, in addition to features that are extracted from the DNS response message itself. The information in the databases is obtained from the Censys search engine and an IP geolocation service. PASSVM is evaluated using three types of artificial neural networks: Multilayer Perceptron (MLP), Radial Basis Function Network (RBF), and Support Vector Machines (SVM). Results show that SVM with RBF kernel outperformed the other two methods with an accuracy of 99.557% and a detection time of less than 18 ms.

1 Introduction

The Domain Name System (DNS) is a core element of the Internet infrastructure. It is implemented as a distributed hierarchical database and is viewed as a crucial backbone of the Internet. It mainly provides a mapping between domain names and their IP addresses, in addition to other important functions that are necessary for the proper functioning of websites. At the same time, DNS plays an integral part in the operation of different types of attacks and malicious activities such as DNS amplification attacks macfarland2017best anagnostopoulos2013dns , DNS cache poisoning attacks alharbi2019collaborative jackson2009protecting , malware distribution zhauniarovich2018survey , and botnets singh2019issues patsakis2020encrypted . As millions of new domain names are registered every day, there is a growing concern that many of these domains might belong to botnets and be used in various types of malicious activities.

Botnets represent a significant threat that is continuously evolving with new techniques and architectures. They are used to perform different types of malicious activities such as distributed denial of service attacks, email spam, phishing, and malware distribution. Botherders are continuously developing techniques to hide their malicious activities and to evade detection. Attackers rely on DNS to resolve the IP addresses of the domain names (e.g., phishing domains, command and control (C&C) servers, etc.) that are used in their attacks. Therefore, DNS provides important information that can reveal different attack activities.

Fast Flux Service Networks (FFSNs) are special forms of botnets that are mainly designed to provide a resilient and highly available service while evading detection. The technique has been adopted by botmasters since 2007, with an increasing activity rate in recent years. The main purpose of fast flux networks is to hide the content server (also called the mothership server), where the malicious content is hosted, behind a botnet of compromised machines that are called flux agents. Flux agents are configured by the botmaster to serve as proxies that relay traffic to/from the origin server. Similar to content distribution networks (CDNs), FFSNs achieve high availability by using a technique that mimics the Round Robin Domain Name System (RRDNS) to map domain names to the IP addresses of flux agents. In FFSNs, a fast flux domain is mapped to multiple IP addresses that keep changing very fast. This increases the chances that the origin server is reachable through some of the flux agents that are still running and have not been blacklisted yet.

The Internet Honeynet project was the first to systematically describe the problem of fast flux networks and their main features saluskyknow . Subsequently, the research community paid more attention to this growing threat and several mechanisms were proposed to address the problem. With the renewed adoption of fast flux networks in major botnets (e.g., SandiFlux Sandiflux and DarkCloud darkcloud ), new fast flux detection mechanisms were proposed in recent years (e.g., fastflux-passive-2 fastflux-passive-5 ). Most of these mechanisms rely on analyzing DNS traffic information that corresponds to fast flux domains in order to characterize their behavior and identify their distinguishing features. In this regard, DNS records can be obtained actively, by issuing DNS requests for fast flux domain names that are obtained from email spam campaigns and phishing archives, or passively, through the analysis of collected DNS traffic traces.

Detecting fast flux networks accurately and instantly (i.e., online detection) is an important and challenging problem. While some previously proposed mechanisms have achieved a high detection rate, they require a long time to collect DNS and other related information from different sources. In this paper, we propose a novel and efficient AI-based online fast flux detection system. The main goal of the proposed system (PASSVM) is to perform online fast flux detection based on a single DNS response message for a given domain name. To achieve this goal, we investigated different Artificial Neural Network (ANN) models and trained them using features that are stored locally. The feature information is based on the A records that are found in the DNS response message. This means that we restrict our feature set to the information that is available in the DNS response itself or can be obtained from local databases that are downloaded in advance. Therefore, well-known fast flux features that require any form of active query are not needed. The first database is constructed from data that is downloaded in advance from the Censys search engine and includes historical information about the IPv4 address space. The second database is constructed in advance using data from the IP geolocation service iptoasn , which provides location and autonomous system number (ASN) information for the IP addresses that correspond to a given domain name. Specifically, the main contributions of this paper are:

  1. An efficient and highly accurate online fast flux detection system based on features that are available in a single DNS response message.

  2. Leveraging the data that is made available by Censys search engine censysio about IPv4 address space and data from IP geolocation service.

  3. Two new fast flux features that allow for online fast flux detection are proposed. The two features are based on the database that can be downloaded from Censys search engine.

  4. The proposed system is evaluated using three types of artificial neural network models which are: Multilayer Perceptron (MLP), Radial Basis Function Network (RBF), and Support Vector Machines (SVM). SVM with RBF kernel outperformed the other two methods with an accuracy of 99.557% and a detection time of less than 18 ms.

The remainder of the paper is organized as follows. Section 2 presents background on fast flux networks. In Section 3, we discuss the related work. In Section 4, we present the proposed PASSVM system. In Section 5, we discuss the different ANN models that are used in the evaluation of the proposed approach. In Section 6, we evaluate our system and discuss the performance of the different ANN models based on recent datasets. The conclusion of the paper is given in Section 7.

2 Fast Flux Networks

Round robin DNS and content distribution networks (CDNs) are two main techniques that are employed by web servers to achieve load balancing and high availability. In round robin DNS, the authoritative domain name server of a certain domain name is configured to distribute the workload to multiple redundant web servers by mapping the host name of the web server to multiple IP addresses. This mapping keeps changing in a round robin fashion. Each time a client issues a DNS query, the client may obtain the list of IP addresses for the given host name in a different order. In CDNs, the content is pushed to a large number of geographically distributed servers. Global load balancing is achieved by providing the client with a set of IP addresses of nearby servers. For example, a user in the USA who is trying to access a CDN-hosted website sends a DNS query for that website and gets a reply with a set of IP addresses of servers that are hosted in nearby locations within the CDN.

FFSNs employ similar techniques in order to provide a high availability of malicious servers while hiding their real locations. Figure 1 shows the main stages of constructing and operating a fast flux service network. Initially, the botnet herder sets up a mothership server in order to host some sort of malicious content for the purpose of malware distribution, illegal pharmaceutical products sale, or hosting adult content, etc. (step 1). A domain name, such as xyz.com, is assigned for this server. Then, a botnet of fast flux agents is formed and each agent is configured to serve as a proxy server to relay traffic to/from the mothership server (step 2). Flux agents are mainly compromised machines with intermittent connectivity, limited computational power, and low to average bandwidth.

Afterwards, the botnet herder registers the domain name of the FFSN with a set of IP addresses that belong to the botnet of fast flux agents (step 3). Therefore, any access to the malicious FFSN domain goes through one of the flux agents whose address is returned to the client by the DNS system (step 4). The botnet of flux agents thus forms a protection layer for the hidden malicious server. In order to increase the resilience of the network and to evade detection, the botnet herder keeps changing the domain name registration in a fast manner. This type of FFSN is called single-flux. There is a more sophisticated type, called double-flux, in which the botnet herder also changes the mapping between the authoritative name server of the FFSN and its IP addresses in a fast manner, thereby providing a layer of protection for the FFSN's authoritative name server.

Figure 1: Stages of the life cycle of fast flux service network

Figures 2 and 3 show the DNS lookup result that is obtained using the Unix dig utility for the fast flux domain (flowjob.top.). It can be seen that this fast flux domain is mapped to multiple IP addresses and the mapping keeps on changing over time. For example, the second dig, which was performed seconds after the first one, showed a new set of IP addresses that did not appear in the first dig output, which is a common characteristic of fast flux networks. Previous research studies have identified several features that mainly characterize fast flux domains fastflux-active2 fastflux-passive-2 . These features include:

  • Large number of IP addresses. The number of A records that is included within a single DNS response message of a fast flux domain is relatively large. If one or more of the fast flux agents that are associated with the IP addresses are down, a client that is trying to access the mothership server of the associated domain name automatically tries another IP address (i.e., another agent) until it succeeds. Registering the domain name with a large number of IP addresses provides high availability of the malicious server, as it increases the probability that one of the flux agents is up and running.

  • Large IP growth. In order to avoid blacklisting, the mapping between a fast flux domain and agent IP addresses keeps changing over time. Therefore, the number of IP addresses that are associated with a certain fast flux domain becomes large.

  • Low TTL value. Since the mapping between a domain name and IP addresses changes very fast in FFSNs, the TTL values are kept low. This guarantees that cached records expire soon after the fast flux domain is resolved, so that users obtain the new list of IP addresses.

  • Large number of autonomous systems. The IP addresses, that are returned in response to a DNS query for a fast flux domain, represent compromised machines that belong to different organizations and Internet Service Providers. Therefore, it is expected that IP addresses of these agents belong to multiple autonomous systems.

  • Large number of countries. Previous studies showed that the IP addresses of fast flux domains are usually located in a relatively large number of countries. This is expected since attackers register their fast flux domains with a set of IP addresses that are selected randomly from a pool of fast flux agents.

  • Domain names do not last for a long time. The lifetime of a fast flux domain is relatively short. Attackers tend to register a large number of domains for their FFSNs, where each domain name remains active for a short period of time.

Figure 2: Output of the first dig of the fast flux domain flowjob.top. (performed on April 22 2019)
Figure 3: Output of the second dig of the fast flux domain flowjob.top. (performed on April 22 2019)
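
The turnover between two successive lookups, as described for Figures 2 and 3, can be quantified with a simple set comparison. The following is a minimal sketch for illustration only: the addresses are made-up values from documentation ranges (not the actual dig output), and the helper name `ip_overlap` is hypothetical. It illustrates the background behavior of fast flux domains rather than a feature used by PASSVM, which works on a single response.

```python
def ip_overlap(first, second):
    """Return the Jaccard overlap between two sets of A-record IPs.

    0.0 means the two lookups share no address; 1.0 means they are identical.
    """
    a, b = set(first), set(second)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical A records from two lookups performed a few seconds apart.
first_dig = ["192.0.2.10", "192.0.2.44", "198.51.100.7", "203.0.113.5"]
second_dig = ["192.0.2.91", "198.51.100.23", "203.0.113.77", "203.0.113.140"]

# Addresses that appear only in the second lookup.
new_ips = set(second_dig) - set(first_dig)

print(ip_overlap(first_dig, second_dig))  # no shared address in this example
print(sorted(new_ips))
```

A stable legitimate domain would typically show an overlap close to 1.0 between lookups that are seconds apart, whereas a completely new set of addresses yields 0.0.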

3 Related Work

Most of the previous work in the area of fast flux detection (e.g., chen2019deep kokkelkoren2019catching al2015gflux wang2017hiding ) has focused mainly on analyzing DNS traffic traces. Some methods performed active DNS probing to collect DNS records about suspect domains, while other methods used DNS records that were collected passively. Active and passive DNS information collection is usually combined with other information that is collected from different sources such as the whois database, IP2location services, and blacklisted domains. In addition, some information is based on active measurements of the delay and other parameters.

Characterization of fast flux networks was first presented in saluskyknow and fastflux-active-1 . In these studies, an active DNS-based approach was proposed in which the DNS system was queried actively for domain names that were collected from Internet spam archives and obtained by means of spam traps. DNS A records were analyzed by searching for fast flux domain footprints. These studies provided important insight into the nature of this threat and identified the main characteristics of fast flux domains, such as low TTL values, a large number of IP addresses, geographical distribution of flux agents, sharing of flux agents, and sharing of scam web pages. In fastflux-active-1 , a metric called the fluxy-score is defined and computed over a set of parameters that are related to the DNS records of a certain domain in order to determine whether the domain is a fast flux domain or a legitimate one.

Konte, Feamster, and Jung revealed similar characteristics and provided insight into the dynamics of scam hosting infrastructure with a focus on the role of fast flux service networks fastflux-active2 . Other studies (e.g., fastflux-active3 fastflux-active4 ) have focused on botnet detection through fast flux identification. The main problem of active DNS-based detection of fast flux networks is that it incurs a high delay because it requires a long time to collect DNS records for suspicious domain names. Also, the high rate of DNS queries that is received by authoritative domain name servers under the attackers' direct control may alert them to such activity.

Other fast flux detection mechanisms (e.g., FluxBuster fastflux-passive-2 ) relied on passive DNS monitoring, where DNS A records are obtained in a passive manner through monitoring DNS traffic. In FluxBuster, live DNS traffic traces were captured by placing sensors at various strategic locations within an ISP network. The C4.5 decision tree classifier was applied to a set of features that are extracted from the traces and other features that are obtained actively. The reported results showed a high false positive rate and a long detection time due to the requirement of monitoring domains for a long period of time (five days in some cases). The work presented in fastflux-passive-2 followed a similar approach where DNS traffic traces were collected from a large corporate network. Mathematical and data mining techniques were applied to a set of features that are extracted from the monitored traffic in order to achieve near real-time fast flux detection. A system called Fast-flux hunter was proposed in fastflux-passive-5 . The system combines supervised and unsupervised online learning for fast flux detection based on features extracted from passive DNS traffic. In fastflux-passive-4 , a hybrid fast flux detection method that combines real-time detection with long-term DNS traffic analysis and monitoring was proposed. The method employed a decision tree classifier to achieve an accuracy rate close to 96%.

Generally, a passive fast flux detection approach does not involve direct interaction with the domain name system. This has the advantage of eliminating network delays in obtaining DNS records. It also avoids false DNS replies that can be provided by attackers who control authoritative domain name servers and observe a large number of DNS queries. Moreover, it has the advantage of discovering fast flux domains that could potentially appear in different malicious sources such as phishing emails, hacker forums, and online social networks. However, this approach requires processing a huge amount of DNS traffic traces that contain different types of information regarding malicious domains and legitimate domains.

The approach presented in fastflux-realtime1 does not rely mainly on collecting DNS information. Instead, it relies on certain intrinsic characteristics of fast flux networks, with the observation that long delays are expected when fetching a document through a flux agent, because the flux agent acts as a proxy to a back-end mothership server. On the other hand, the time required to download similar content from a legitimate server is short. The scheme can detect fast flux domains within a few seconds whenever a client attempts to download a document from a certain server. The scheme involves issuing additional HTTP requests to verify the legitimacy of the web server. It can also be applied in an active mode to identify fast flux domains in advance. However, this scheme has several limitations: it requires live interaction with malicious servers, and it does not discover flux agents that are hosted on powerful PCs. It can also result in many false positives when legitimate servers are hosted on low-end hardware.

Previous fast flux detection mechanisms suffer from major drawbacks in the sense that they usually require a considerable amount of time to actively or passively collect DNS information about flux domains. Also, the mechanisms cannot detect new fast flux domains before collecting enough data about them. As a result, the DNS traffic analysis requires long computation times in order to achieve acceptable detection accuracy. Unlike the previous work, the fast flux detection system proposed in this paper performs on-the-fly, highly accurate fast flux domain detection by leveraging information about the IPv4 address space that is obtained in advance and stored in local databases. As a result, it eliminates the long-time monitoring and analysis of DNS traffic and makes it possible to detect fast flux domains using only a single DNS response message.

4 The Proposed PASSVM System

4.1 Overview of the proposed system

An online fast flux detection system should be able to perform fast flux detection on the fly based on the A records available in a single DNS response message for the domain name. In other words, the system should directly answer whether a domain name is a fast flux domain or not without performing active DNS probing or seeking additional information from external sources at decision time. This requirement is very important to avoid delays and prevent additional traffic overhead during the classification process. To achieve this goal, PASSVM relies mainly on information about IPv4 addresses that is collected in advance (for example, one day before performing fast flux classification) from two main sources. The first source is the Censys search engine censysio , which performs Internet-wide scanning using the Zmap fast Internet scanner censys15 . Censys scans the IPv4 address space daily, and the scan results can be obtained by users through special APIs. Alternatively, Zmap can be used directly to perform the daily scanning in advance since it has the ability to scan the whole IPv4 address space within less than one hour zmap . The second source is the IP geolocation service iptoasn , which provides the mapping between IP addresses and their locations (cities and countries). It also provides the ASN for each IP address. IP geolocation data can be downloaded and used locally. In addition, the proposed approach uses specific features that are extracted from the DNS response message itself.

Figure 4 shows the proposed system architecture. The system can be used as a module within a local DNS resolver, where suspicious DNS requests are inspected by PASSVM. Any domain name that has many IP addresses (for example, more than 5 IPs) is considered a suspicious domain. As depicted in Figure 4, a feature set is extracted from the DNS response and from the local databases that were constructed using Censys and IP geolocation. An artificial neural network classifier is then used to decide whether the domain is a fast-flux domain or a legitimate domain.

Figure 4: The proposed PASSVM system architecture
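
The pre-filtering step in front of the classifier can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the verdict strings, and the `classify` callback are hypothetical, counting distinct addresses is an assumption, and the threshold of 5 is the example value mentioned in the text.

```python
SUSPICION_THRESHOLD = 5  # "more than 5 IPs" is the example given in the text


def is_suspicious(a_records, threshold=SUSPICION_THRESHOLD):
    """Return True when a response carries more distinct A records than threshold."""
    return len(set(a_records)) > threshold


def inspect_response(domain, a_records, classify, threshold=SUSPICION_THRESHOLD):
    """Run the (hypothetical) classifier only on suspicious responses.

    classify(domain, a_records) is assumed to return True for fast flux.
    """
    if not is_suspicious(a_records, threshold):
        return "not-inspected"
    return "fast-flux" if classify(domain, a_records) else "legitimate"


# Usage with a placeholder classifier that flags everything it sees.
few = ["192.0.2.1", "192.0.2.2"]
many = ["192.0.2.%d" % i for i in range(1, 9)]  # 8 distinct addresses
print(inspect_response("example.test.", few, lambda d, ips: True))
print(inspect_response("example.test.", many, lambda d, ips: True))
```

Keeping the filter separate from the classifier means that responses with few A records never incur feature extraction or classification cost.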

4.2 Fast flux features set

In this subsection, we describe the fast flux features that are used in the detection process of the proposed AI-based system. In order to achieve a fast classification model, we used features that can be obtained from the locally stored IPv4 address space information or from the DNS response message.

4.2.1 Censys-based fast flux features

Two important fast flux features that complement other known fast flux features are introduced in this paper. Here we discuss the two new features. Given a DNS reply of a fast flux domain name, it is expected that some of the flux agents, of the list of IP addresses in the DNS reply, might be down at a given time. The reason is that these agents are machines that belong to normal end users in different organizations and can be powered off or get disconnected from the Internet at any time. Consequently, querying Censys with a set of IP addresses that belong to a fast-flux domain will not return information about all the addresses in the search query. This is due to the fact that Censys data is obtained by performing a daily Internet wide scanning for the IPv4 address space using Zmap fast Internet scanner, and there is a high probability that the scanner will not find some of the fast flux agents because they are offline at the time of the scanning. On the other hand, IP addresses that correspond to a legitimate domain name are usually well-maintained servers. Therefore, they are not expected to be offline and there is a high probability that they are reachable by the Zmap scanner. This means that querying Censys with a list of IP addresses that correspond to a legitimate domain will return information about most of them. Hence, the ratio of IP addresses that are returned from Censys to the number of IP addresses that are submitted in the query represents an important feature to distinguish between fast flux domains and legitimate domains. This is the first new feature that is introduced in this paper.

The second important feature that can be extracted from Censys search results is related to the overall number of open ports that are discovered by Censys for the set of IP addresses of a certain domain name. In the case of a legitimate domain name, it is expected that all hosting servers of the same domain have similar configurations, which results in the same open ports on the hosting servers. On the other hand, in the case of a fast flux domain, there is a high chance that other ports are open in addition to the ones that are configured by the attacker. This is because the infected machines in FFSNs belong to many different users and their configurations are heterogeneous. Hence, the number of distinct open ports can indicate whether a domain name is a fast flux domain or a legitimate domain.

For illustration, Figure 5 shows the results that are returned from the Censys search engine for a set of 10 IP addresses that belong to the fast-flux domain (hex001.info.). Censys returned information about only some of the IP addresses that were submitted in the query. On the other hand, Figure 6 shows the results that are returned from the Censys search engine for a set of 20 IP addresses that belong to the legitimate domain (uefa.com.). As shown in the figure, Censys returned information about all of the IP addresses that were submitted in the query. Hence, the IP ratio of the fast-flux domain (hex001.info.) is less than 1, and the IP ratio of the legitimate domain (uefa.com.) is 1. In addition, the figures show the open ports of the IP addresses in the query. For the fast flux domain, there are five distinct open port numbers (Ports 443, 3389, 1433, 5432, and 80). However, for the legitimate domain there is only one open port number (Port 443). Hence, the number of open ports of fast flux domains is relatively greater than that of legitimate domains.

Figure 5: Example of the results that are returned by Censys for 10 IP addresses of the fast flux domain hex001.info. (performed on December 7 2019)
Figure 6: Example of the results that are returned by Censys for 20 IP addresses of the legitimate domain uefa.com. (performed on December 7 2019)

In summary, the two newly introduced features for fast-flux detection are:

  • IP ratio: The ratio of the number of IP addresses that is returned from Censys to the number of IP addresses that is submitted in the query.

  • Ports: The number of distinct open port protocols for all of the IP addresses that are returned from Censys search engine.
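
The two Censys-based features can be computed from a locally stored scan snapshot along the following lines. This is a minimal sketch under stated assumptions: the snapshot layout (a dictionary from IP address to a set of open ports), the function name `censys_features`, and all addresses and port sets are hypothetical and do not reflect the actual Censys data format.

```python
def censys_features(response_ips, censys_db):
    """Compute (ip_ratio, ports) for the IPs of one DNS response.

    censys_db maps an IP string to the set of open ports seen in the last
    scan; IPs that the scan did not reach are simply absent from the mapping.
    """
    ips = list(dict.fromkeys(response_ips))  # de-duplicate, keep order
    if not ips:
        return 0.0, 0
    found = [ip for ip in ips if ip in censys_db]
    ip_ratio = len(found) / len(ips)
    distinct_ports = set()
    for ip in found:
        distinct_ports.update(censys_db[ip])
    return ip_ratio, len(distinct_ports)


# Hypothetical snapshot: only two of the four queried hosts were seen online.
snapshot = {
    "192.0.2.10": {443, 3389},
    "198.51.100.7": {80, 443},
}
queried = ["192.0.2.10", "192.0.2.44", "198.51.100.7", "203.0.113.5"]

ip_ratio, ports = censys_features(queried, snapshot)
print(ip_ratio, ports)  # 0.5 3
```

Because the lookup is a local dictionary access, no network query is needed at classification time, which is the property the online setting relies on.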

4.2.2 IP geolocation-based features

The Censys search engine provides information about the geographical distribution of the IP addresses in a query to its database. This includes information such as countries, cities, and ASNs. However, Censys does not always provide this information for all of the IP addresses that are found in a given DNS response. Therefore, the IP geolocation service is used to obtain the information for all of the IP addresses in a DNS response message that belongs to a given domain name. In particular, we define the following three features that are based on the number of countries and the number of ASNs provided by the IP geolocation service:

  1. ASN: This feature is defined as the ratio of the number of distinct ASNs for the set of IP addresses in a given DNS response message to the total number of IP addresses in the DNS response. For example, if the number of IP addresses that is returned in a DNS response message for a certain domain is $n$, and the IPs belong to $k$ distinct ASNs, then the ASN feature equals $k/n$.

  2. Regions: This feature defines the number of distinct countries for all of IP addresses in a given DNS response message.

  3. RegionalSpread: This feature is defined as the ratio of the number of distinct countries, for all of IP addresses in a given DNS response message, to the number of IP addresses in the response message.
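
The three geolocation-based features can be derived from a local IP-to-ASN/country table as in the sketch below. The table layout (IP address mapped to an `(asn, country_code)` pair), the function name `geo_features`, and the sample values are assumptions made for illustration; the ASNs are taken from the range reserved for documentation.

```python
def geo_features(response_ips, geo_db):
    """Return (asn_ratio, regions, regional_spread) for one DNS response.

    geo_db maps an IP string to an (asn, country_code) tuple.
    """
    ips = list(dict.fromkeys(response_ips))  # distinct IPs, order preserved
    if not ips:
        return 0.0, 0, 0.0
    located = [geo_db[ip] for ip in ips if ip in geo_db]
    asns = {asn for asn, _ in located}
    countries = {country for _, country in located}
    n = len(ips)
    return len(asns) / n, len(countries), len(countries) / n


# Hypothetical geolocation table for four addresses.
geo_table = {
    "192.0.2.10": (64500, "US"),
    "192.0.2.44": (64501, "DE"),
    "198.51.100.7": (64501, "DE"),
    "203.0.113.5": (64502, "BR"),
}

asn_ratio, regions, spread = geo_features(list(geo_table), geo_table)
print(asn_ratio, regions, spread)  # 0.75 3 0.75
```

Here four addresses fall into three ASNs and three countries, so both ratios are 3/4, which matches the k/n form of the ASN feature definition.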

4.2.3 DNS-Response based features

The DNS response message itself contains important features that contribute significantly in distinguishing fast-flux domains. This includes the following features:

  1. DomainLength: This feature is defined as the number of characters in the domain name. Usually, malicious domains, including fast flux domains, have long domain names. Therefore, the domain name length is included as one of the features for fast flux detection.

  2. IPCount: This feature is defined as the number of A records that are found in a DNS response message for a given domain. As explained in Section 2, this number is expected to be relatively large for fast flux domains.

  3. TTL: This feature is defined as the TTL value of the DNS reply message for a given domain. As explained in Section 2, fast flux domains usually have very short TTL values.

Table 1 summarizes the main features that are used in the proposed system for fast flux detection. In total, eight features are used. The system obtains the features from the DNS response message or from the two databases that are stored locally. The information in the databases is obtained from Censys and the IP geolocation service. Because every feature is available locally at decision time, the proposed approach can perform online fast flux detection without any active query.

| Feature | Name | Description |
|---|---|---|
| F1 | DomainLength | The length of a domain name |
| F2 | Regions | The number of countries where the IPs are located |
| F3 | Ports | The overall number of open ports for all of the IPs |
| F4 | IPCount | The number of IP addresses in a DNS response message |
| F5 | IP ratio | The ratio of the returned IPs (by Censys) to the number of IPs in the DNS response message |
| F6 | TTL | The TTL value of the DNS response message |
| F7 | ASN | The ratio of the number of distinct ASNs to the number of IPs in the DNS response message |
| F8 | RegionalSpread | The ratio of the number of distinct countries to the number of IPs in the DNS response message |

Table 1: The main features that are used in the system
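
Putting the features F1 to F8 together, a feature vector for one DNS response could be assembled as sketched below. This is an illustration under stated assumptions rather than the authors' code: the function name, the two dictionary layouts, and the sample data are hypothetical, and ignoring the trailing dot when measuring the domain length as well as counting distinct addresses for IPCount are assumptions.

```python
def build_feature_vector(domain, ttl, a_records, censys_db, geo_db):
    """Return [F1..F8] in the order of Table 1 for one DNS response.

    censys_db: dict mapping IP -> set of open ports (hypothetical layout).
    geo_db:    dict mapping IP -> (asn, country_code) (hypothetical layout).
    """
    ips = list(dict.fromkeys(a_records))  # distinct IPs, order preserved
    n = len(ips)
    if n == 0:
        raise ValueError("DNS response contains no A records")

    scanned = [ip for ip in ips if ip in censys_db]
    ports = set().union(*(censys_db[ip] for ip in scanned))
    located = [geo_db[ip] for ip in ips if ip in geo_db]
    asns = {asn for asn, _ in located}
    countries = {country for _, country in located}

    return [
        len(domain.rstrip(".")),  # F1 DomainLength (trailing dot ignored: assumption)
        len(countries),           # F2 Regions
        len(ports),               # F3 Ports
        n,                        # F4 IPCount
        len(scanned) / n,         # F5 IP ratio
        ttl,                      # F6 TTL
        len(asns) / n,            # F7 ASN
        len(countries) / n,       # F8 RegionalSpread
    ]


# Hypothetical local databases and DNS response.
censys_db = {"192.0.2.10": {443, 3389}, "198.51.100.7": {80, 443}}
geo_db = {
    "192.0.2.10": (64500, "US"),
    "192.0.2.44": (64501, "DE"),
    "198.51.100.7": (64501, "DE"),
    "203.0.113.5": (64502, "BR"),
}
a_records = ["192.0.2.10", "192.0.2.44", "198.51.100.7", "203.0.113.5"]

vector = build_feature_vector("example.test.", 60, a_records, censys_db, geo_db)
print(vector)  # [12, 3, 3, 4, 0.5, 60, 0.75, 0.75]
```

In practice the features would likely be scaled before training, since TTL and the ratio features live on very different ranges; the source does not state which scaling, if any, was applied.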

5 Classification Methods

Artificial Neural Network (ANN) models have been widely used for classifying data into classes. These models are powerful tools for supervised machine learning, where a model classifies a data input into one of several classes. During the training of an ANN model, the data inputs are vectors of values and their corresponding classes. In our learning approach, vectors of the selected features along with their classes (fast flux or legitimate domain) have been used to train the ANN models. 10-fold cross-validation is used on the input data for training and validation. Three types of ANN models were used: Multilayer Perceptron (MLP) or Feedforward Neural Network, Radial Basis Function Network (RBF), and Support Vector Machines (SVM). SVM outperformed the other two methods. Below is a brief review of each of the ANNs.
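
As a brief illustration of the 10-fold procedure, the sketch below generates train/validation index splits in plain Python. It shows the general k-fold technique only; the function name, the shuffling seed, and the fold assignment scheme are assumptions, and the source does not describe how the folds were actually built.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold validation.

    Every sample index appears in exactly one validation fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


splits = list(k_fold_indices(25, k=10))
for train, validation in splits[:2]:
    print(len(train), len(validation))
```

Each model is trained on nine folds and validated on the remaining one, and the ten validation scores are then averaged.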

5.1 MLP and RBF neural networks

In MLP, the feature input vector ($x$) is multiplied by the weight matrix ($W$) and a bias vector ($b$) is added to the product Russell2009 . Then an activation function $f$ is applied to rescale the result to a value between 0 and 1. Different activation functions can be used such as the Sigmoid and the Softmax functions MLP1 MLP2 . Figure 7 depicts a general MLP network. Hence, the output of the network is calculated using Equation 1.

$$y = f(Wx + b) \qquad (1)$$

Figure 7: Structure of a Multilayer Perceptron Network

The Softmax function has also shown high performance when it is used as the activation function. It is given in Equation 2, where $z_i$ is the $i$-th input to the layer, and it has been used for the output layer in our experiments.

$$f(z_i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}} \qquad (2)$$
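
The forward pass described in this subsection, an affine map followed by an activation and a Softmax output layer, can be sketched in plain Python as follows. This is a minimal sketch and not the SPSS configuration used in the experiments: the single hidden layer with Sigmoid activation, the layer sizes, and the weights are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Softmax over a list of scores (shifted by the max for numerical stability)."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def affine(W, x, b):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer with sigmoid units, followed by a softmax output layer."""
    hidden = [sigmoid(z) for z in affine(W1, x, b1)]
    return softmax(affine(W2, hidden, b2))


# Illustrative weights: 3 inputs, 2 hidden units, 2 output classes.
W1 = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]

probs = mlp_forward([0.5, 1.0, -0.5], W1, b1, W2, b2)
print(probs)
```

The Softmax output can be read as class probabilities, so the predicted class is the index of the largest entry.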

On the other hand, Radial Basis Function (RBF) networks apply radial functions to the input vector ($\mathbf{x}$). A widely used radial function is the Gaussian function, shown in Equation 3, where $\mathbf{c}$ is a center point such that the function decreases or increases monotonically with the distance from $\mathbf{c}$, and $r$ is the function radius.

$$h(\mathbf{x}) = \exp\left(-\frac{\lVert \mathbf{x}-\mathbf{c}\rVert^{2}}{r^{2}}\right) \tag{3}$$

In RBF networks, the hidden layer represents the radial functions as shown in Figure 8. Softmax or Gaussian activation functions can be used. Both functions were used in our experiments, as discussed in the evaluation section. If the Gaussian basis function is used, the output of the network is computed using Equation 4 RBFNN1 RBFNN2 RBFNN3, where $h_j$ is the Gaussian function of hidden unit $j$, $w_j$ is its output weight, and $m$ is the number of hidden units.

$$y(\mathbf{x}) = \sum_{j=1}^{m} w_j\, h_j(\mathbf{x}) \tag{4}$$

Figure 8: Structure of a Radial Basis Function Network
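The sketch below shows how an RBF network output could be evaluated as a weighted sum of Gaussian hidden units. It assumes one common form of the Gaussian, exp(-||x - c||^2 / r^2), with a center `c` and radius `r` per hidden unit; the exact scaling used in the experiments is not stated here, and all numeric values are hypothetical.

```python
import math


def gaussian_rbf(x, c, r):
    """Gaussian radial function centered at c with radius r (assumed form)."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-dist_sq / (r ** 2))


def rbf_network_output(x, centers, radii, weights):
    """Weighted sum of the hidden-unit responses (identity output layer)."""
    return sum(
        w * gaussian_rbf(x, c, r)
        for w, c, r in zip(weights, centers, radii)
    )


# Hypothetical network with two hidden units in a 2-D feature space.
centers = [[0.0, 0.0], [1.0, 0.0]]
radii = [1.0, 1.0]
weights = [2.0, 1.0]
out = rbf_network_output([0.0, 0.0], centers, radii, weights)
print(out)
```

The response of each unit is 1 at its center and decays with distance, so inputs near a center are dominated by that unit's weight.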

The IBM SPSS tool was used in our experiments for both the MLP and RBF neural networks. The identity function is used for the output layer of the RBF network.

5.2 SVM and RBF kernel

Support Vector Machine (SVM) finds a separating hyperplane with the maximum margin of separation. In other words, it maximizes the distance between the separating hyperplane and the closest data points (support vectors). The design of a single neuron for fast flux detection can be interpreted as a classification problem. Therefore, the synthesis of a fast flux detection neural network can be solved by a set of $n$ independent SVMs, where $n$ is the number of neurons in the neural network Burges1998.

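During training of the SVM, the pairs $(\mathbf{x}_i, y_i)$, $i = 1, \dots, N$, represent the feature training patterns. Each feature input vector $\mathbf{x}_i$ belongs to one of two classes ($y_i = -1$ or $+1$), where $-1$ denotes a fast flux domain and $+1$ a legitimate domain. Assuming the feature vectors are linearly separable, there exists a separating hyperplane as depicted in Equation 5.

$$\mathbf{w}^{T}\mathbf{x} + b = 0 \tag{5}$$

The weights $\mathbf{w}$ and the bias $b$ can be rescaled to get Equation 6.

$$y_i\left(\mathbf{w}^{T}\mathbf{x}_i + b\right) \ge 1, \quad i = 1, \dots, N \tag{6}$$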

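The corresponding weights and bias represent the optimal hyperplane. To compute the optimal hyperplane using Lagrange multipliers $\alpha_i$, one per training pattern $(\mathbf{x}_i, y_i)$, $i = 1, \dots, N$, the objective is to minimize Equation 7.

$$\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j\,\mathbf{x}_i^{T}\mathbf{x}_j - \sum_{i=1}^{N}\alpha_i \tag{7}$$

subject to:

$$\sum_{i=1}^{N}\alpha_i y_i = 0, \qquad \alpha_i \ge 0, \quad i = 1, \dots, N \tag{8}$$

Solving Equation 7 with its constraints, the optimum Lagrange multipliers are used to compute the weights $\mathbf{w}$ for all neurons as depicted in Equation 9. For more details, please refer to Haykin1998 and Casali2006.

$$\mathbf{w} = \sum_{i=1}^{N}\alpha_i y_i\,\mathbf{x}_i \tag{9}$$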
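For readers who want the intermediate step, the following is a sketch of the standard textbook route from the maximum-margin problem to its dual form and to the expression of the weights in terms of the Lagrange multipliers. The notation ($\alpha_i$ for the multipliers, $N$ for the number of training patterns, $\mathcal{L}$ for the Lagrangian) is an assumption made for this sketch.

```latex
\begin{align*}
% Primal problem: maximize the margin under the rescaled constraints
&\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
  \quad\text{s.t.}\quad y_i\left(\mathbf{w}^{T}\mathbf{x}_i + b\right) \ge 1,
  \quad i = 1,\dots,N \\
% Lagrangian with multipliers alpha_i >= 0
&\mathcal{L}(\mathbf{w}, b, \alpha) = \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
  - \sum_{i=1}^{N}\alpha_i\left[y_i\left(\mathbf{w}^{T}\mathbf{x}_i + b\right) - 1\right] \\
% Stationarity conditions
&\frac{\partial\mathcal{L}}{\partial\mathbf{w}} = 0
  \ \Rightarrow\ \mathbf{w} = \sum_{i=1}^{N}\alpha_i y_i\,\mathbf{x}_i,
  \qquad
  \frac{\partial\mathcal{L}}{\partial b} = 0
  \ \Rightarrow\ \sum_{i=1}^{N}\alpha_i y_i = 0 \\
% Substituting both conditions back into the Lagrangian
&\mathcal{L} = \sum_{i=1}^{N}\alpha_i
  - \tfrac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j\,\mathbf{x}_i^{T}\mathbf{x}_j
\end{align*}
```

Maximizing the last expression over $\alpha_i \ge 0$ is the same as minimizing its negative, which is why the dual problem can be stated as a minimization.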

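However, our experiments showed that the feature input vectors are not linearly separable. To solve this problem, kernel techniques can be used along with the SVM to map the input vectors into a higher-dimensional space using a non-linear mapping function $\varphi(\cdot)$ in order to make them linearly separable. The result is a feature space of the input vectors. Introducing the feature space function into Equation 7, the objective function can be rewritten as in Equation 10.

$$\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j\,\varphi(\mathbf{x}_i)^{T}\varphi(\mathbf{x}_j) - \sum_{i=1}^{N}\alpha_i \tag{10}$$

A kernel function notation $K(\mathbf{x}_i, \mathbf{x}_j) = \varphi(\mathbf{x}_i)^{T}\varphi(\mathbf{x}_j)$ can be used in Equation 10, which results in Equation 11 sharma2018 Cho2008.

$$\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j\,K(\mathbf{x}_i, \mathbf{x}_j) - \sum_{i=1}^{N}\alpha_i \tag{11}$$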

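We used different kernel functions (feature mappings), and the best results were achieved by the Radial Basis Function (RBF) kernel, depicted in Equation 12, where $\gamma > 0$ controls the kernel width. The SVM-light tool, which provides different kernels, was used in our experiments Joachims99.

$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma\,\lVert\mathbf{x}_i - \mathbf{x}_j\rVert^{2}\right) \tag{12}$$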
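To make the role of the kernel concrete, the sketch below evaluates an RBF kernel and the resulting SVM decision for a single feature vector. It assumes the common parameterization exp(-gamma * ||x - z||^2) and the standard kernel decision function sign(sum_i alpha_i y_i K(x_i, x) + b). The support vectors, multipliers, bias, and `gamma` are hypothetical toy values, not a trained PASSVM model.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value between two feature vectors (assumed form)."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * dist_sq)


def svm_decision(x, support_vectors, labels, alphas, bias, gamma):
    """Return -1 (fast flux) or +1 (legitimate) for feature vector x."""
    score = bias + sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hypothetical model with one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]          # -1: fast flux, +1: legitimate
alphas = [1.0, 1.0]
print(svm_decision([0.1, 0.2], support_vectors, labels, alphas, 0.0, 0.5))
print(svm_decision([1.9, 2.1], support_vectors, labels, alphas, 0.0, 0.5))
```

Because only kernel evaluations against the stored support vectors are needed, classifying one record costs a handful of exponentials, which fits the goal of fast per-record detection.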

6 Evaluation

The IBM SPSS tool was used for both the MLP and RBF neural networks on a Windows 10 machine. For SVM, we used the SVM-light tool on an Ubuntu 18.04 LTS machine. The tool provides different kernel implementations Joachims99. The dataset used in the experiments is described in Subsection 6.1. Subsection 6.2 discusses the performance of the different ANN techniques.

6.1 The dataset

PASSVM relies on information provided by the Censys search engine about the IP addresses that correspond to fast flux domains. Because Censys began providing access to its daily full IPv4 address scans in late 2017, we restricted our fast flux dataset to include only fast flux domains that appeared during 2018 and 2019. In this work, we mainly collected fast flux domains that were active from April 2018 to January 2019. To achieve this goal, a seed of 80 confirmed fast flux domains was collected manually from recently published papers AGD fastflux-passive-2 . In addition, some domains were taken from recent tweets about new fast flux domains. Then, we performed active DNS lookups for every unique domain name in the initial list using the Linux dig utility for a period of two months. The IP addresses of the domain names were extracted from the DNS response messages of these lookups. For each IP address resolved from the initial fast flux dataset, a query to VirusTotal VirusTotal was performed to get an updated list of the domain names that had resolved to it since April 2018, along with the date associated with each domain name. After that, for each domain name, a query was sent to VirusTotal to get the list of IP addresses that were resolved for that domain name and their dates. In total, we were able to obtain fast flux domains. The dataset of legitimate domains was obtained by performing active DNS queries for Alexa's top million domains Alexa . Then, we filtered the domain names and included only domains with or more IP addresses in their DNS response messages. The result is a dataset of legitimate domain names and their corresponding IP addresses.
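The lookup-and-filter step can be sketched as follows. The parser assumes the answer-section layout printed by `dig +noall +answer` (name, TTL, class, type, data). The helper names, the sample text with documentation-range addresses, and the `min_ips` threshold are hypothetical, since the threshold used in the study is not given here.

```python
def parse_dig_answer(text):
    """Collect A records per domain from dig answer-section lines."""
    records = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: name TTL class type rdata; skip anything else.
        if len(fields) != 5 or fields[2] != "IN" or fields[3] != "A":
            continue
        name = fields[0].rstrip(".").lower()
        ttl, ip = int(fields[1]), fields[4]
        entry = records.setdefault(name, {"ttl": ttl, "ips": []})
        entry["ttl"] = min(entry["ttl"], ttl)  # keep the smallest TTL seen
        if ip not in entry["ips"]:
            entry["ips"].append(ip)
    return records


def filter_by_ip_count(records, min_ips):
    """Keep only domains that resolved to at least min_ips addresses."""
    return {d: r for d, r in records.items() if len(r["ips"]) >= min_ips}


# Hypothetical output of repeated lookups (documentation address ranges).
sample = """\
example.com.      300  IN  A      192.0.2.10
example.com.      300  IN  A      192.0.2.11
example.com.      120  IN  A      192.0.2.10
www.example.net.  60   IN  CNAME  example.net.
example.org.      3600 IN  A      198.51.100.7
"""
records = parse_dig_answer(sample)
kept = filter_by_ip_count(records, min_ips=2)
print(kept)
```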

For each domain and its corresponding IP addresses in the aforementioned datasets, a query was submitted to the Censys search engine through its API. The results are received as JSON objects similar to the example shown in Figure 9. These objects were parsed to extract the information of interest, such as the number of distinct port numbers, the number of IP addresses, and the number of countries and cities. Moreover, we used the geolocation database available at iptoasn to get the AS numbers and the countries associated with all of the IP addresses of a given domain name.

Figure 9: A sample of a JSON object returned by Censys
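Once the per-IP information has been parsed, the per-domain counts can be aggregated along the following lines. The record layout (`ip`, `ports`, `country`, `asn`) is a hypothetical simplification and not the actual Censys or iptoasn schema. Of the output keys, only `regional_spread` corresponds to a feature defined in Table 1 (distinct countries divided by the number of IPs); the other key names are illustrative.

```python
def summarize_hosts(hosts):
    """Aggregate per-IP metadata into per-domain counts and ratios."""
    ips = {h["ip"] for h in hosts}
    ports = {p for h in hosts for p in h.get("ports", [])}
    countries = {h["country"] for h in hosts if h.get("country")}
    asns = {h["asn"] for h in hosts if h.get("asn") is not None}
    return {
        "ip_count": len(ips),
        "distinct_ports": len(ports),
        "distinct_countries": len(countries),
        "distinct_asns": len(asns),
        # RegionalSpread: distinct countries divided by number of IPs.
        "regional_spread": len(countries) / len(ips) if ips else 0.0,
    }


# Hypothetical metadata for the four IPs returned for one domain.
hosts = [
    {"ip": "192.0.2.1", "ports": [80, 443], "country": "AA", "asn": 64500},
    {"ip": "192.0.2.2", "ports": [80], "country": "AA", "asn": 64500},
    {"ip": "198.51.100.3", "ports": [22, 80], "country": "BB", "asn": 64501},
    {"ip": "203.0.113.4", "ports": [], "country": "BB", "asn": 64502},
]
features = summarize_hosts(hosts)
print(features)
```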

6.2 Performance Evaluation

The proposed PASSVM system has been evaluated using Multilayer Perceptron (MLP), Radial Basis Function Network (RBF), and Support Vector Machines (SVM), as described in Section 5. Feature selection is an important step in training artificial neural networks. Therefore, the normalized importance of the different fast flux network features was evaluated during the training experiments. The normalized importance of the features for the MLP network and the RBF network is shown in Figure 10 and Figure 11, respectively. Based on Figure 10, the DomainLength, Regions, and Ports features have the highest importance. For the RBF network, however, the features IPCount, IPs, TTL, and Ports have the highest importance, as shown in Figure 11. This suggests that all the features selected in this study play a role in distinguishing fast flux domains from legitimate domains. It also provides evidence that the two newly introduced features (the number of open ports and the IP ratio) are of high importance for achieving highly accurate domain name classification.

Figure 10: Normalized weight of features in MLP Network
Figure 11: Normalized weight of features in RBF Network

We performed several experiments to evaluate the performance of the different ANNs, using 10-fold cross-validation to obtain the trained model. In every run of the experiment, about 90% of the dataset is used for training and 10% for testing. Results show that the Support Vector Machine with RBF kernel outperforms the other ANNs with an accuracy of 99.557%, as shown in Figure 12. Table 2 compares the accuracy, false positive rates, and false negative rates of the different classifiers. SVM with RBF kernel has the highest accuracy and the lowest false negative rate, with a very low false positive rate. Because false negatives are so rare, the few misclassified domains can be cached in a lookup table in a practical deployment of PASSVM. The overall average execution time of this classifier for training and testing does not exceed 40 ms, and the average time to test one DNS record was less than 18 ms. Based on this comparison, SVM with RBF kernel was chosen because it performs very fast and highly accurate fast flux detection.

| Classifier | Accuracy (%) | FPR | FNR |
|---|---|---|---|
| SVM (RBF Kernel) | 99.557 | 0.008 | 0.004 |
| RBF (Softmax) | 97.4 | 0.0058 | 0.1001 |
| RBF (Gaussian) | 96.5 | 0.0066 | 0.1284 |
| MLP | 99.0 | 0.0048 | 0.0483 |

Table 2: Comparison of different ANN classifiers
Figure 12: The accuracy of different ANN techniques
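The three metrics reported in Table 2 can be computed from true and predicted labels as sketched below. Treating the fast flux class (label -1) as the positive class is an assumption of this sketch, and the label vectors are hypothetical toy data rather than experimental results.

```python
def classification_rates(y_true, y_pred, positive=-1):
    """Return accuracy, false positive rate, and false negative rate."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1  # fast flux domain classified as legitimate
        else:
            if pred == positive:
                fp += 1  # legitimate domain classified as fast flux
            else:
                tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical labels: -1 is fast flux, +1 is legitimate.
y_true = [-1, -1, -1, -1, 1, 1, 1, 1, 1, 1]
y_pred = [-1, -1, -1, 1, 1, 1, 1, 1, 1, -1]
rates = classification_rates(y_true, y_pred)
print(rates)
```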

6.3 Comparison with the state-of-the-art

Table 3 compares the proposed PASSVM system, which is based on SVM with RBF kernel, with the state-of-the-art mechanisms for fast flux detection. The mechanisms include GRADE grade , FF-Hunter fastflux-passive-5 , FluxBuster FluxBuster , and the system of Zang et al. AGD . The comparison criteria are whether the detection is performed online or offline, the capability of performing detection based on a single DNS record, the time for training and testing, the accuracy, and the memory used, as reported by the authors of the methods.

As shown in Table 3, PASSVM has the best performance in terms of time, accuracy, and memory used. It is also capable of performing online fast flux detection based on a single DNS packet, given that the system is trained in advance. This allows new fast flux domains to be detected as they appear in the wild, when users try to access them. Hence, PASSVM can be used by organizations to provide passive online fast flux detection. In addition, the system's AI model can be retrained as more data become available to enhance its accuracy.

Previously proposed systems take more time to classify a domain name because they require features that must be obtained from different Internet sources rather than from local databases. For example, GRADE performs the fast flux detection task based on the entropy of domains of preceding nodes for all A records, and the standard deviation of the round trip time for all of the A records. This requires performing traceroute and real-time measurement of the round trip time for all of the records, which incurs high overhead and may fail when ICMP messages are filtered. The system proposed in AGD analyzes live traffic that is collected from the upper DNS hierarchy by applying literal composition to identify DGA-generated domains. Then it clusters the domains based on their literal features and the edit distance. An extreme learning machine (ELM) is used to classify the domain clusters into fast flux domains and legitimate domains based on different features that require active queries of the whois database. Some of the 14 features used in FF-Hunter require collecting a large number of DNS records for a given domain, which means that it cannot perform detection based on a single DNS record, as it takes extra time to collect the required features. FluxBuster FluxBuster relies on characteristics that are obtained from passive DNS traffic traces, in addition to active data collection, in order to classify domain names accurately. Moreover, it does not provide fast detection and requires the collection of a large number of DNS records for each domain.

| Algorithm | Online? | Single packet detection? | Time used (s) | Accuracy (%) | Used memory (MB) |
|---|---|---|---|---|---|
| GRADE grade | Yes | Yes | 30.48 | 98.47 | 85.34 |
| FF-Hunter fastflux-passive-5 | Yes | No | 43.87 | 98 | 92.71 |
| FluxBuster FluxBuster | No | No | 198.45 | 99.4 | 102.32 |
| Zang, Gong, Mo, Jakalan, Ding AGD | Yes | No | 32.71 | 99.1 | 105.64 |
| PASSVM | Yes | Yes | 0.04 | 99.557 | 40.00 |

Table 3: Comparison with the state-of-the-art systems

7 Conclusions

Fast flux service networks provide Internet adversaries with the capability to hide their malicious servers while maintaining high availability. There is a pressing need to identify fast flux networks quickly in order to minimize the risk of accessing malicious websites and hence spreading malware. In this paper, a novel AI-based online fast flux detection system is proposed. The proposed PASSVM system applies artificial intelligence algorithms to identify fast flux domains based on features that are associated with a single DNS record. The features are obtained directly from the record itself, in addition to information that is available in local databases. The database information is obtained from the Censys search engine and an IP geolocation service. PASSVM performs online fast flux detection with high accuracy. Experimental evaluations demonstrate that SVM with RBF kernel outperforms the other artificial neural networks and achieves a high accuracy of 99.557% with a low false negative rate of 0.4%. Because misclassifications are this rare, the affected domains can be cached in a lookup table in a practical deployment of the system, driving the effective false negative rate toward zero. Compared with the state-of-the-art fast flux detection systems, PASSVM achieves the best performance in terms of accuracy, time, and memory used. The approach is also practical to deploy within organizational networks so that employees do not access malicious domains, which helps prevent the spread of malware infections.

Acknowledgement

We thank the Censys team for providing us with access to the Internet-wide scanning database in order to conduct this research. We also thank the VirusTotal team for providing us with a private API key to access their data for collecting the fast flux dataset.

References

  • (1) MacFarland, D.C., Shue, C.A. and Kalafut, A.J.: The best bang for the byte: Characterizing the potential of DNS amplification attacks. Computer Networks, 116, 12-21 (2017)
  • (2) Anagnostopoulos, M., Kambourakis, G., Kopanos, P., Louloudakis, G. and Gritzalis, S.: DNS amplification attack revisited. Computers & Security, 39, 475-485 (2013)
  • (3) Alharbi, F., Chang, J., Zhou, Y., Qian, F., Qian, Z. and Abu-Ghazaleh, N.: Collaborative Client-Side DNS Cache Poisoning Attack. In IEEE INFOCOM 2019-IEEE Conference on Computer Communications, 1153-1161 (2019)
  • (4) Jackson, C., Barth, A., Bortz, A., Shao, W. and Boneh, D.: Protecting browsers from DNS rebinding attacks. ACM Transactions on the Web (TWEB), 3(1), 1-26 (2009)
  • (5) Zhauniarovich, Y., Khalil, I., Yu, T. and Dacier, M.: A survey on malicious domains detection through DNS data analysis. ACM Computing Surveys (CSUR), 51(4), 1-36 (2018)
  • (6) Singh, M., Singh, M. and Kaur, S.: Issues and challenges in DNS based botnet detection: A survey. Computers & Security (2019)
  • (7) Patsakis, C., Casino, F. and Katos, V.: Encrypted and covert DNS queries for botnets: Challenges and countermeasures. Computers & Security, 88, p.101614 (2020)
  • (8) Salusky, W. and Danford, R.: Know Your Enemy: Fast-Flux Service Networks. The Honeynet Project (2007)
  • (9) Sandiflux: Another Fast Flux infrastructure used in malware distribution emerges. https://www.proofpoint.com/us/threat-insight/post/sandiflux-another-fast-flux-infrastructure-used-malware-distribution-emerges. Accessed 15 March 2019
  • (10) Crowder, W., Dunker, N.: Dark cloud network facilitates crimeware. https://www.riskanalytics.com/wp-content/uploads/2017/10/Dark_Cloud_Network_Facilitates_Crimeware.pdf. Accessed 10 March 2019
  • (11) Massa, D.: Fast Flux Service Network Detection via Data Mining on Passive DNS Traffic. Information Security, p.463 (2018)
  • (12) Almomani, A.: Fast-flux hunter: a system for filtering online fast-flux botnet. Neural Computing and Applications, 29(7), 483-493 (2018)
  • (13) Free IP address to ASN database. https://iptoasn.com. Accessed 10 Sept. 2019
  • (14) Durumeric, Z., Adrian, D., Mirian, A., Bailey, M. and Halderman, J.A.: A search engine backed by Internet-wide scanning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 542-553 (2015)
  • (15) Konte, M., Feamster, N. and Jung, J.: Dynamics of online scam hosting infrastructure. In International conference on passive and active network measurement, 219-228 (2009)
  • (16) Chen, X., Li, G., Zhang, Y., Wu, X. and Tian, C.: A Deep Learning Based Fast-Flux and CDN Domain Names Recognition Method. In Proceedings of the 2019 2nd International Conference on Information Science and Systems, 54-59 (2019)
  • (17) Kokkelkoren: Catching Flux-networks in the open. Master’s thesis, University of Twente (2019)
  • (18) Al-Duwairi, B., Al-Hammouri, A., Aldwairi, M. and Paxson, V.: GFlux: A google-based system for Fast Flux detection. In 2015 IEEE Conference on Communications and Network Security (CNS), 755-756 (2015)
  • (19) Wang, Z., Qin, M., Chen, M. and Jia, C.: Hiding fast flux botnet in plain email sight. In International Conference on Security and Privacy in Communication Systems, 182-197 (2017)
  • (20) Holz, T., Gorecki, C., Rieck, K. and Freiling, F.C.: Measuring and Detecting Fast-Flux Service Networks. In NDSS (2008)
  • (21) Yadav, S., Reddy, A.K.K., Reddy, A.N. and Ranjan, S.: Detecting algorithmically generated malicious domain names. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement, 48-61 (2010)
  • (22) Zhao, D. and Traore, I.: P2P botnet detection through malicious fast flux network identification. In 2012 Seventh International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, 170-175 (2012)
  • (23) Perdisci, R., Corona, I. and Giacinto, G.: Early detection of malicious flux networks via large-scale passive DNS traffic analysis. IEEE Transactions on Dependable and Secure Computing, 9(5), 714-726 (2012)
  • (24) Hsu, C.H., Huang, C.Y. and Chen, K.T.: Fast-flux bot detection in real time. In International Workshop on Recent Advances in Intrusion Detection, 464-483 (2010)
  • (25) Censys Search Engine. https://censys.io/. Accessed 10 Aug 2019
  • (26) Durumeric, Z., Wustrow, E. and Halderman, J.A.: ZMap: Fast Internet-wide scanning and its security applications. In Presented as part of the 22nd USENIX Security Symposium, 605-620 (2013)
  • (27) Russell, S. and Norvig, P.: Artificial intelligence: a modern approach. Prentice Hall Press. Upper Saddle River, NJ, USA (2009)
  • (28) Kusakunniran, W., Prachasri, N., Dirakbussarakom, N. and Yangchaem, D.: Distinguishing ACL patients from healthy individuals using multilayer perceptron on motion patterns. In 2017 9th International Conference on Knowledge and Smart Technology (KST), 1-5 (2017)
  • (29) Li, Y., Tang, G., Du, J., Zhou, N., Zhao, Y. and Wu, T.: Multilayer perceptron method to estimate real-world fuel consumption rate of light duty vehicles. IEEE Access, 7, 63395-63402 (2019)
  • (30) Yu, H., Reiner, P.D., Xie, T., Bartczak, T. and Wilamowski, B.M.: An incremental design of radial basis function networks. IEEE transactions on neural networks and learning systems, 25(10), 1793-1803 (2014)
  • (31) Chouhan, S.S., Kaul, A., Singh, U.P. and Jain, S.: Bacterial foraging optimization based radial basis function neural network (BRBFNN) for identification and classification of plant leaf diseases: An automatic approach towards plant pathology. IEEE Access, 6, 8852-8863 (2018)
  • (32) Raitoharju, J., Kiranyaz, S. and Gabbouj, M.: Training radial basis function neural networks for classification via class-specific clustering. IEEE transactions on neural networks and learning systems, 27(12), 2458-2471 (2015)
  • (33) Burges, C.J.: A tutorial on support vector machines for pattern recognition. Data mining and knowledge discovery, 2(2), 121-167 (1998)
  • (34) Haykin, S.: Neural networks: a comprehensive foundation. Prentice Hall PTR. Upper Saddle River, NJ, USA (1998)
  • (35) Casali, D., Costantini, G., Perfetti, R. and Ricci, E.: Associative memory design using support vector machines. IEEE Transactions on Neural Networks, 17(5), 1165-1174 (2006)
  • (36) Sharma, G., Panwar, A., Nasiruddin, I. and Bansal, R.C.: Non-linear LS-SVM with RBF-kernel-based approach for AGC of multi-area energy systems. IET Generation, Transmission & Distribution, 12(14), 3510-3517 (2018)
  • (37) Cho, B.H., Yu, H., Lee, J., Chee, Y.J., Kim, I.Y. and Kim, S.I.: Nonlinear support vector machine visualization for risk factor analysis using nomograms and localized radial basis function kernels. IEEE Transactions on Information Technology in Biomedicine, 12(2), 247-256 (2008)
  • (38) Joachims, T.: Making large-scale SVM learning practical. MIT Press, Cambridge, MA 169–184 (1999)
  • (39) Zang, X.D., Gong, J., Mo, S.H., Jakalan, A. and Ding, D.L.: Identifying fast-flux botnet with AGD names at the upper DNS hierarchy. IEEE Access, 6, 69713-69727 (2018)
  • (40) VirusTotal. https://www.virustotal.com. Accessed 10 May 2020
  • (41) Alexa-Top Sites, https://www.alexa.com/topsites
  • (42) Lin, H.T., Lin, Y.Y. and Chiang, J.W.: Genetic-based real-time fast-flux service networks detection. Computer Networks, 57(2), 501-513 (2013)
  • (43) Perdisci, R., Corona, I. and Giacinto, G.: Early detection of malicious flux networks via large-scale passive DNS traffic analysis. IEEE Transactions on Dependable and Secure Computing, 9(5), 714-726 (2012)