Ensemble-based Feature Selection and Classification Model for DNS Typo-squatting Detection

06/08/2020 ∙ by Abdallah Moubayed, et al. ∙ Western University

Domain Name System (DNS) plays an important role in the current IP-based Internet architecture. This is because it performs the domain name to IP resolution. However, the DNS protocol has several security vulnerabilities due to the lack of data integrity and origin authentication within it. This paper focuses on one particular security vulnerability, namely typo-squatting. Typo-squatting refers to the registration of a domain name that is extremely similar to that of an existing popular brand with the goal of redirecting users to malicious/suspicious websites. The danger of typo-squatting is that it can lead to information threat, corporate secret leakage, and can facilitate fraud. This paper builds on our previous work in [1], which only proposed a majority-voting based classifier, by proposing an ensemble-based feature selection and bagging classification model to detect DNS typo-squatting attacks. Experimental results show that the proposed framework achieves high accuracy and precision in identifying the malicious/suspicious typo-squatting domains (a loss of at most 1.5% in accuracy and 5% in precision when compared to the model that used the complete feature set) while having a lower computational complexity due to the smaller feature set (a reduction of more than 50% in feature set size).


I Introduction

The Domain Name System (DNS) protocol is an important pillar in the Internet’s current and future architecture [2, 3, 4, 5]. This is because it is the standard mechanism for name to IP address resolution [2]. Moreover, it helps users to determine the location of servers and mailing hosts, resulting in a direct impact on the data exchange process [2, 3].
However, DNS is vulnerable to a variety of security threats and attacks, as illustrated by recent DNS attacks [6, 7]. One example is the distributed denial of service (DDoS) attack on Dyn in October 2016, which caused a significant portion of America’s Internet service to go down [8, 9]. Another example is the attack on a Brazilian bank’s website. During this attack, attackers rerouted the traffic targeted to the bank’s website to their own servers. This was done by changing the DNS registrations of all the bank’s domains, resulting in many users divulging their authentication information to the attackers [10]. These vulnerabilities can be mainly attributed to the lack of data integrity and origin authentication processes within the DNS protocol structure.
One such vulnerability that the DNS protocol suffers from is typo-squatting. Typo-squatting refers to the registration of a domain name that is extremely similar to that of an existing popular brand with the goal of redirecting users to malicious/suspicious websites. This is done by registering confusingly similar domain names whose differences the user might not notice [11]. For example, the www.paypal.com domain can be easily confused with the www.paypa1.com domain. The danger of typo-squatting is that it can lead to information threat, corporate secret leakage, and can facilitate fraud [12, 13].
Hence, it is crucial that DNS is able to tolerate failure and is resilient to attacks given its importance to the proper functioning of the Internet [7]. This has led various researchers to propose different mechanisms to combat and protect against failures and attacks. One such mechanism is the DNSSEC protocol, which aims at addressing some of the security vulnerabilities of DNS by providing data integrity and origin authentication [14]. Yet, DNSSEC still cannot address other attacks such as amplified denial of service attacks [7, 15]. Thus, it is important that more efficient detection mechanisms are implemented that can protect systems from the various attacks by better identifying malicious queries.
This paper builds on our previous work in [1], which only proposed a majority-voting based classifier to detect DNS typo-squatting. In contrast, this work proposes an ensemble-based feature selection and bagging classification (EFSBC) model to detect DNS typo-squatting attacks. This is done to reduce the complexity of the DNS typo-squatting detection framework while maintaining its high accuracy and low false positive rate. To that end, this work presents a framework in which three different feature selection techniques are combined to identify features that are crucial for the accurate detection of malicious/suspicious DNS domains. Moreover, the framework proposes the use of bagging ensemble classification models (that can reduce model variance) to further improve the accuracy of DNS typo-squatting detection.


The contributions of this work can be summarized as follows:

  • Proposing an ensemble feature selection method that selects the crucial features using multiple selection techniques.

  • Proposing a bagging ensemble classification model that identifies malicious/suspicious domain names with high accuracy.

  • Evaluating the performance of the proposed model in comparison to other traditional classification models.

Fig. 1: DNS Vulnerabilities and Challenges

The remainder of this paper is organized as follows: Section II describes the different security vulnerabilities that the DNS protocol faces. Section III summarizes the previous work in the literature. Section IV illustrates the proposed ensemble-based DNS typo-squatting detection framework and discusses its complexity. Section V describes the dataset considered in this work and the data transformation/feature extraction process. Section VI presents the experiment setup and discusses the corresponding results. Finally, Section VII concludes the paper.

II DNS Vulnerabilities & Challenges

As mentioned earlier, the DNS protocol has several security vulnerabilities due to the lack of data integrity and origin authentication within it. Fig. 1 briefly lists some of the different vulnerabilities and attacks that DNS faces [7, 12, 16].

  1. DDoS attacks: Root DNS services are vulnerable to DDoS attacks. This is mainly due to the hierarchical architecture adopted. This is dangerous as it can cause a loss of availability of name resolution services which can lead to the stoppage of Internet service [8, 9, 17, 18, 19].

  2. Registrar hijacking: Malicious users can hijack a registrar. As a result, these users would control all the corresponding domain names. In turn, this can lead to enterprises and companies losing their domain names. One such example is the attack on a Brazilian bank’s website [10]. As part of this attack, all the traffic to the bank’s website was redirected to the attackers’ own servers. This was possible because the attackers changed the DNS registrations of all the bank’s domains [10]. This attack had severe consequences, with thousands of users being affected due to the leakage of sensitive information such as their banking, email, and FTP credentials [10].

  3. Cache Poisoning Problems: Cache poisoning is the result of the lack of data update propagation or invalidation mechanisms for DNS caches. Cache poisoning can be achieved using Name Chaining or Transaction ID Prediction.

    1. Name Chaining: Attacker adds random DNS names in the DNS response which leads to the introduction of false information into the cache.

    2. Transaction ID Prediction: Attacker sends multiple DNS queries for domain names under his/her control. Then the attacker hopes that the transaction ID in one of the subsequent spoof replies matches the transaction ID that is used as part of the queries between the two servers.

  4. Man in the middle (MiTM) attacks: Attacks such as Packet Sniffing and Transaction ID Guessing are possible due to the fact that the DNS protocol does not offer a mechanism for servers to provide authentication details for the data sent to clients. This can result in a threat to the users’ privacy by directing them to suspicious or malicious domains and servers.

    1. Packet Sniffing: DNS reply packets can be intercepted and modified by the attacker.

    2. Transaction ID Guessing: Attackers that can correctly guess the transaction ID can send false replies to legitimate queries.

  5. Other DNS attacks: In addition to the attacks listed above, DNS is also prone to other types of attacks such as Information Leakage and Typo-squatting.

    1. Information Leakage (DNS Tunneling): As part of this attack, an attacker would leak sensitive information as part of DNS queries or their responses.

    2. Typo-squatting: This attack focuses on registering a domain name that highly matches that of an existing domain name in an attempt to confuse/fool users. This is dangerous as it can lead to information threat, corporate secret leakage, and can facilitate fraud [12].

This work mainly focuses on the typo-squatting attack. This is due to the severe consequences of such an attack including leakage of corporate secrets, leakage of sensitive personal information, and ultimately fraud [12]. Therefore, detecting such attacks through efficient and intelligent mechanisms is a necessity.

III Related Work

Securing the Internet has been a growing concern in recent years given the growth in the number of attacks witnessed on Internet services. Due to the abundance of data being collected by Internet service providers and network administrators, machine learning (ML)-based mechanisms have been proposed as a viable and efficient solution to better detect attacks on Internet services. For example, intrusion and DDoS attack detection mechanisms using different classification algorithms such as artificial neural networks and support vector machines (SVM) have been proposed for Software-Defined Networks (SDNs) [20]. Similarly, an optimized ML-based anomaly detection framework was proposed that achieved high accuracy and a low false alarm rate [21]. Additionally, decision tree (DT)-based algorithms have also been proposed as effective DDoS attack detection mechanisms in cloud computing environments [22]. Similarly, a tree-based intrusion detection system for autonomous vehicles was proposed that achieves a high detection rate with a low computational cost [23].
Few works in the literature explored the use of ML within the context of DNS security. Zhauniarovich et al. surveyed the state-of-the-art work on malicious domain detection through DNS data analysis [24]. Bilge et al. proposed a DT-based classification model to detect malicious domains [16]. Similarly, Sivakorn et al. proposed the use of ML to detect malicious DNS queries [25]. Also, Sountharrajan et al. used deep learning models to detect phishing URLs [26]. On the other hand, Almusawi proposed an SVM model to detect DNS tunneling [27]. In contrast, Fukuda et al. proposed the use of ML to classify originator activity of DNS backscatter [28]. Weber et al. proposed the use of unsupervised clustering to identify malicious domain campaigns [29]. However, very few works in the literature have focused on the DNS typo-squatting attack.

IV Proposed Approach

IV-A Proposed Approach

This paper extends our previous work in [1] by proposing an ensemble-based feature selection and bagging classification (EFSBC) model to detect DNS domain typo-squatting. This is done to reduce the complexity of the DNS typo-squatting detection framework while maintaining its high accuracy and low false positive rate. The proposed approach, as shown in Fig. 2, can be divided into three components, namely:

  1. Extract domain name representative features.

  2. Develop an ensemble feature selection model to identify crucial features.

  3. Develop a bagging ensemble classification model to detect malicious/suspicious domains.

Fig. 2: Proposed EFSBC Approach

IV-B Proposed Approach Application

As mentioned earlier, the proposed approach can be divided into three main phases, namely the data transformation/feature extraction phase, the feature selection phase (using three different feature selection mechanisms), and the classification model phase. In the data transformation/feature extraction phase, a set of features representative of the typo-squatting attack is extracted. These features mainly focus on the domain name and its characteristics. Section V-B discusses this in more detail.
In the second phase, a subset of features is selected using different feature selection mechanisms and given to the classification model as input. This is done in an attempt to reduce the complexity of the classification model and decrease its training time without sacrificing its performance [30]. This is particularly important when dealing with large-scale systems generating big data [30]. Three different feature selection mechanisms are considered in this work, representing three different categories of feature selection algorithms. The first algorithm is the correlation-based feature selection algorithm, which belongs to the group of “Traditional Statistical” feature selection techniques [31]. The second algorithm is the information gain algorithm, which belongs to the group of “Information Theory” techniques [32]. The third feature selection algorithm is the One Rule algorithm, which is one of the “Decision Tree” based feature selection algorithms [33]. The results of these feature selection mechanisms are combined to produce a subset made up of the highest-ranked features in the dataset. The mathematical details of these algorithms are given in Section V-C.
After performing feature selection, the reduced feature dataset is given as an input to the bagging ensemble classifier. In particular, a bagging ensemble model is chosen as it can reduce the classification variance of any base model while maintaining its low bias characteristics and improving the classification accuracy [34]. Therefore, a bagging ensemble classification model is adopted in this work with the aim of providing a domain typo-squatting detection framework with high accuracy and a low false alarm rate.
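To make the bagging idea concrete, the following is a minimal sketch in Python (the paper itself uses MATLAB). It assumes a k-NN base learner, bootstrap samples of the same size as the training set, and majority voting over the base learners. The function names, the number of estimators, and the toy feature vectors are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, x, k=1):
    """Classify x by majority vote among its k nearest training points."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Draw one bootstrap sample (with replacement) per base learner."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x, k=1):
    """Aggregate the base learners' predictions by majority vote."""
    votes = Counter(knn_predict(bx, by, x, k) for bx, by in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (length, unique characters, unique letters).
# Label 0 = legitimate, 1 = DGA, matching the encoding used in the paper.
toy_x = [
    (6, 5, 5), (7, 6, 6), (5, 4, 4), (8, 6, 6), (6, 6, 6),
    (20, 15, 14), (24, 16, 15), (18, 14, 13), (22, 17, 16), (26, 18, 17),
]
toy_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_models = fit_bagging(toy_x, toy_y)
```

Because each base learner sees a different bootstrap sample, their individual errors are partly decorrelated, which is the source of the variance reduction that motivates bagging.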

IV-C Complexity of Proposed Approach

The complexity of the proposed approach is dependent on the complexity of each of its phases, namely the feature extraction phase, the feature selection phase, and the classification model building phase. It is assumed that there are data samples and features. Accordingly, the complexity of the feature extraction phase is as the algorithm needs to go through all the dataset to extract the features.
The complexity of the feature selection phase depends on the complexity of each of the feature selection methods considered. The complexity of Correlation-based feature selection is to calculate all the class-feature and feature-feature correlations [35]. On the other hand, the complexity of information gain-based feature selection method is to calculate the joint probabilities of the class-feature interaction [36]. Finally, the complexity of One Rule algorithm is given that you have to determine the classification accuracy based on each feature [37]. Therefore, the overall complexity of the feature selection process is .
Finally, the complexity of the bagging ensemble classification model is dependent on the type of base learners used as part of the ensemble. In this work, the base learners considered are DTs and nearest neighbors (NN) due to their high accuracy as shown in [1]. Given that building the bagging ensemble can be performed in parallel, the complexity of the DT-based bagging ensemble can be estimated as and the complexity of nearest neighbor-based bagging ensemble can be estimated as [38, 39] where is the size of the reduced feature set.
By combining the computational complexity of the different phases knowing that , the overall complexity of the proposed approach is in the order of assuming that the DT-based bagging ensemble classification model is chosen. However, given that the feature selection process can be performed offline, the complexity of the proposed approach can be considered to be in the order of .

V Dataset Description

V-A Data Preprocessing:

The dataset under consideration in this work was originally collected by the authors of the “Data Driven Security” book [40]. The collection process consisted of a combination of Alexa’s top 1 million legitimate domains and Cryptolocker’s list of domains generated algorithmically (DGA) [41]. The resulting dataset is a list of 133,926 unique domains divided into 81,261 legitimate domains and 52,665 DGA domains. Each record consists of three fields as illustrated in Table I.

| Field | Description | Example |
|---|---|---|
| Host | Domain’s complete URL | www.mydaily.co.uk |
| Domain | Actual domain accessed | mydaily |
| Domain Class | Domain classification | Legit or DGA |

TABLE I: Domain Features Description

V-B Data Transformation/Feature Extraction:

The dataset was transformed using MATLAB into a new dataset of eight features that characterize a unique domain name. These features were chosen due to the nature of the typo-squatting attack, which mainly focuses on modifying the domain name. All the features under consideration are numeric in nature. More specifically, the first four are integers and the remaining four are continuous.
In addition to the extracted features, a binary feature representing the domain class was also added to the new dataset. In particular, a DGA domain was represented as 1 while a legitimate domain was represented as 0. Table II shows the value type and range of each of the aforementioned features.

| Feature | Value Type | Range of Values |
|---|---|---|
| Length of Domain Name | Numeric | [1,2,…,68] |
| Number of Unique Characters | Numeric | [1,2,…,36] |
| Number of Unique Letters | Numeric | [1,2,…,26] |
| Number of Unique Numbers | Numeric | [0,1,…,10] |
| Ratio of Letters to Domain Length | Numeric | [0-1] |
| Ratio of Numbers to Domain Length | Numeric | [0-1] |
| Ratio of Unique Letters to Unique Characters | Numeric | [0-1] |
| Ratio of Unique Numbers to Unique Characters | Numeric | [0-1] |
| Domain Class | Numeric | [0,1] |

TABLE II: Domain Features Description
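As an illustration of how the eight features in Table II could be computed from a domain string, here is a minimal Python sketch (the paper performs this step in MATLAB, and its exact code is not given). The function name and dictionary keys are hypothetical. The sketch also assumes that the domain is lower-cased first and that every character, including a hyphen if present, counts toward the unique-character total, which the paper does not specify.

```python
import string


def extract_features(domain: str) -> dict:
    """Compute the eight domain-name features listed in Table II."""
    name = domain.lower()  # assumption: case is ignored
    if not name:
        raise ValueError("domain name must be non-empty")
    letters = [ch for ch in name if ch in string.ascii_lowercase]
    digits = [ch for ch in name if ch in string.digits]
    unique_chars = set(name)  # assumption: all characters are counted
    unique_letters = set(letters)
    unique_digits = set(digits)
    return {
        "length": len(name),
        "unique_chars": len(unique_chars),
        "unique_letters": len(unique_letters),
        "unique_numbers": len(unique_digits),
        "ratio_letters": len(letters) / len(name),
        "ratio_numbers": len(digits) / len(name),
        "ratio_unique_letters": len(unique_letters) / len(unique_chars),
        "ratio_unique_numbers": len(unique_digits) / len(unique_chars),
    }
```

For the domain part `paypa1` used as an example in the introduction, this yields a length of 6 with 4 unique characters, 3 unique letters, and 1 unique number.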

V-C Feature Selection Techniques’ Background:

V-C1 Correlation-based Feature Selection


Correlation-based feature selection (CFS) is a simple algorithm that selects feature subsets based on their correlation with the class to be predicted [35]. In essence, CFS considers a feature to be relevant if it is correlated with or predictive of the class [35, 42]. CFS mainly uses Pearson’s correlation coefficient as its feature subset evaluation function. Accordingly, the evaluation function is [35]:

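$$\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}} \tag{1}$$

where:

  • $\mathrm{Merit}_S$: merit of the feature subset $S$

  • $k$: number of features in feature subset $S$

  • $\overline{r_{cf}}$: average class-feature Pearson correlation

  • $\overline{r_{ff}}$: average feature-feature Pearson correlation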

Using this equation, the feature subsets can be ranked and the subset with the highest correlation with the class to be predicted can be selected.
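The merit computation can be sketched in a few lines of Python. The sketch below follows Hall's merit formula [35], k·r_cf / sqrt(k + k(k−1)·r_ff), and makes two assumptions that are not stated in the paper: absolute values of the Pearson coefficients are averaged, and a constant feature is assigned a correlation of zero. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: constant columns carry no correlation
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(features, labels):
    """Merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(features)
    r_cf = sum(abs(pearson(f, labels)) for f in features) / k
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    if pairs:
        r_ff = sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
    else:
        r_ff = 0.0  # a single feature has no feature-feature correlation
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

The denominator penalizes redundancy: adding a feature that is strongly correlated with features already in the subset raises the average feature-feature correlation and therefore does not increase the merit.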

V-C2 Information Gain-based Feature Selection


Information gain-based feature selection is based on the use of information theory concepts such as entropy and mutual information [36]. This algorithm selects features based on the amount of information (in bits) that can be gained from these features. Accordingly, the feature evaluation function is [36]:

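$$I(S;C) = H(S) - H(S \mid C) = \sum_{i}\sum_{j} p(s_i, c_j)\,\log_2 \frac{p(s_i, c_j)}{p(s_i)\,p(c_j)} \tag{2}$$

where:

  • $I(S;C)$: mutual information between feature subset $S$ and class $C$

  • $H(S)$: entropy/uncertainty of discrete feature subset $S$

  • $H(S \mid C)$: conditional entropy/uncertainty of discrete feature subset $S$ given class $C$

  • $p(s_i, c_j)$: joint probability of feature having a value $s_i$ and class being $c_j$

  • $p(s_i)$: probability of feature having a value $s_i$

  • $p(c_j)$: probability of class being $c_j$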

Using these values, the information gained from each feature with respect to the class can be calculated and the highest features can be selected.
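As a small worked illustration, the mutual information between one discrete feature and the class can be estimated from empirical frequencies. The Python sketch below uses the standard definition with base-2 logarithms, so the result is in bits. The function names are hypothetical, and continuous features (such as the ratio features) would need to be discretized first, a step the paper does not detail.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Empirical mutual information (in bits) between a feature and the class."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    count_f = Counter(feature)
    count_c = Counter(labels)
    total = 0.0
    for (value, cls), count in joint.items():
        p_joint = count / n
        p_value = count_f[value] / n
        p_class = count_c[cls] / n
        total += p_joint * math.log2(p_joint / (p_value * p_class))
    return total
```

A feature that determines a balanced binary class exactly yields 1 bit, while a feature that is independent of the class yields 0 bits.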

V-C3 One Rule-based Feature Selection


The One Rule algorithm, also commonly referred to as “OneR” or “1R”, is a simple one-level decision tree algorithm that creates one rule for each feature in the training data and provides an accuracy measure for that feature [43]. The main motivation is that such a feature selection algorithm can achieve high accuracy while still providing simple rules for humans to interpret and understand [43]. The algorithm can be summarized as follows [43]:
“For each feature f
    For each value v of feature f
        Select set of instances where feature f has a value v
        Let c be the most frequent class in that set
        Set the rule: If feature f has value v, then class is c
Output the feature/rule with the highest classification accuracy.”
Using this algorithm, the classification accuracy of each feature can be calculated and the highest-ranked features can be selected.
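The OneR procedure summarized above can be sketched as follows in Python. The function names and the toy data are hypothetical. The sketch assumes discrete feature values and measures accuracy on the training data itself, which is the simplest reading of the summary.

```python
from collections import Counter, defaultdict


def one_rule(feature_values, labels):
    """Build the one-level rule for a single feature.

    Returns (rule, accuracy), where rule maps each feature value to the
    most frequent class among the instances having that value.
    """
    class_counts = defaultdict(Counter)
    for value, cls in zip(feature_values, labels):
        class_counts[value][cls] += 1
    rule = {v: counts.most_common(1)[0][0] for v, counts in class_counts.items()}
    correct = sum(counts[rule[v]] for v, counts in class_counts.items())
    return rule, correct / len(labels)


def rank_features(dataset, labels):
    """Rank features by the training accuracy of their one-level rule."""
    scores = {name: one_rule(values, labels)[1] for name, values in dataset.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two discrete features and a binary class.
toy_dataset = {
    "length": [5, 5, 12, 12, 5, 12],
    "has_digit": [0, 1, 0, 1, 0, 1],
}
toy_labels = [0, 0, 1, 1, 0, 1]
toy_ranking = rank_features(toy_dataset, toy_labels)
```

In this toy example the rule built on `length` classifies every instance correctly, so that feature is ranked first.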

VI Experiment Results & Discussion

VI-A Experiment Setup

MATLAB was used in this work to transform the data from its original state to the new desired dataset representing the previously provided features, perform the feature selection process, and train the corresponding bagging ensemble classification models.

VI-B Results & Discussion

The experiment results are divided into two sections, namely the feature selection results and the bagging ensemble classification model results.

VI-B1 Feature Selection


Table III shows the feature ranking obtained using the correlation algorithm. Based on this metric, it can be observed in Table III that the features can be divided into two main subsets. The features within the first subset all have a correlation coefficient above 0.6 while the features in the second subset have a correlation coefficient less than 0.4.

| Feature | Correlation |
|---|---|
| Number of Unique Characters | 0.663 |
| Number of Unique Letters | 0.653 |
| Length of Domain Name | 0.621 |
| Number of Unique Numbers | 0.329 |
| Ratio of Numbers to Domain Length | 0.281 |
| Ratio of Unique Letters to Unique Characters | 0.269 |
| Ratio of Unique Numbers to Unique Characters | 0.269 |
| Ratio of Letters to Domain Length | 0.242 |

TABLE III: Feature Selection Using Correlation

Similarly, Table IV shows the information gain of the different features. Again it can be observed that the first subset of features all have an information gain above 0.4 while the second subset has an information gain below 0.15.

| Feature | Information Gain |
|---|---|
| Length of Domain Name | 0.5803 |
| Number of Unique Characters | 0.4486 |
| Number of Unique Letters | 0.4220 |
| Ratio of Letters to Domain Length | 0.1358 |
| Number of Unique Numbers | 0.1154 |
| Ratio of Numbers to Domain Length | 0.1096 |
| Ratio of Unique Letters to Unique Characters | 0.0952 |
| Ratio of Unique Numbers to Unique Characters | 0.0952 |

TABLE IV: Feature Selection Using Information Gain

The same observation can be seen in Table V which shows the class prediction accuracy of the different features. In this case, it is observed that the prediction accuracy of the first subset of features is higher than 80% while that of the second subset of features is lower than 70%.

| Feature | Accuracy of Rule (%) |
|---|---|
| Length of Domain Name | 85.8729 |
| Number of Unique Letters | 82.0148 |
| Number of Unique Characters | 81.9193 |
| Number of Unique Numbers | 68.276 |
| Ratio of Numbers to Domain Length | 68.0303 |
| Ratio of Letters to Domain Length | 67.9281 |
| Ratio of Unique Letters to Unique Characters | 67.5331 |
| Ratio of Unique Numbers to Unique Characters | 67.5308 |

TABLE V: Feature Selection Using OneR Classifier

These results re-iterate the results shown in [1] which illustrated that legitimate domains tend to have more memorable names. In contrast, DGA domains usually have more unique characters with the aim of increasing the randomness of the resulting domain name generated. Accordingly, the subset of features selected as input to the bagging ensemble classification model is made up of 3 features out of the 8 extracted (more than 50% reduction in the feature size), namely the length of the domain name, the number of unique characters, and the number of unique letters.
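The paper does not spell out the exact rule used to combine the three rankings. One simple possibility, shown here purely as an assumption, is to keep the features that appear among the top-ranked entries of every selector. The Python sketch below applies this to the rankings reported in Tables III, IV, and V; the function name and the `top_k` parameter are hypothetical.

```python
def select_consensus(rankings, top_k):
    """Keep the features that appear in the top_k of every ranking."""
    top_sets = [set(ranking[:top_k]) for ranking in rankings]
    return set.intersection(*top_sets)


# Rankings as reported in Tables III (correlation), IV (information gain),
# and V (OneR), ordered from highest to lowest score.
correlation_rank = [
    "Number of Unique Characters",
    "Number of Unique Letters",
    "Length of Domain Name",
    "Number of Unique Numbers",
    "Ratio of Numbers to Domain Length",
    "Ratio of Unique Letters to Unique Characters",
    "Ratio of Unique Numbers to Unique Characters",
    "Ratio of Letters to Domain Length",
]
info_gain_rank = [
    "Length of Domain Name",
    "Number of Unique Characters",
    "Number of Unique Letters",
    "Ratio of Letters to Domain Length",
    "Number of Unique Numbers",
    "Ratio of Numbers to Domain Length",
    "Ratio of Unique Letters to Unique Characters",
    "Ratio of Unique Numbers to Unique Characters",
]
one_r_rank = [
    "Length of Domain Name",
    "Number of Unique Letters",
    "Number of Unique Characters",
    "Number of Unique Numbers",
    "Ratio of Numbers to Domain Length",
    "Ratio of Letters to Domain Length",
    "Ratio of Unique Letters to Unique Characters",
    "Ratio of Unique Numbers to Unique Characters",
]

selected = select_consensus([correlation_rank, info_gain_rank, one_r_rank], top_k=3)
```

With `top_k=3` this intersection contains exactly the three features retained in the paper, since all three selectors agree on their top three entries even though they order them differently.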

VI-B2 Bagging Ensemble Classification Model Performance


As mentioned earlier, a bagging ensemble model was chosen as it can reduce the classification variance of any base model while maintaining its low bias characteristics and improving the classification accuracy [34]. Two different bagging ensemble models are considered, namely a decision-tree bagging ensemble classifier and a k-NN bagging ensemble classifier. These base learners were chosen due to their superior performance as illustrated in [1]. Similar to our previous work [1], we use accuracy, precision, recall, and F-score as our performance metrics as per the equations in [44]. Table VI shows the results of the two bagging ensemble models with the reduced feature set in comparison with the two base learners when the full list of features is used.

| Algorithm | Accuracy (%) | Precision (%) | Recall (%) | F-score |
|---|---|---|---|---|
| C4.5 [1] | 88.1 | 84.5 | 95.8 | 0.89 |
| k-NN [1] | 88.2 | 83.8 | 94.3 | 0.89 |
| Majority-voting Ensemble Classifier [1] | 88.4 | 85.5 | 71.5 | 0.89 |
| DT Bagging Ensemble Classifier | 87.7 | 79.2 | 93.1 | 0.85 |
| k-NN Bagging Ensemble Classifier | 86.7 | 84.9 | 80.6 | 0.82 |

TABLE VI: Performance Evaluation of Classifiers

The results show that both the proposed decision-tree bagging ensemble classifier and the k-NN bagging ensemble classifier still maintain high accuracy, precision, and F-score values despite being trained on a significantly smaller feature set. More specifically, we observe that the degradation is at most 1.5% in terms of accuracy and around 5% in terms of precision while using less than 50% of the feature set. This further emphasizes the efficiency of the proposed framework given that it was able to maintain high accuracy and precision in identifying the malicious/suspicious domains while having a lower computational complexity.
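For reference, the four metrics reported in Table VI can be computed from the entries of a binary confusion matrix. The Python sketch below uses the standard textbook definitions, which are assumed here to match the equations in [44]. The function name is hypothetical, and the DGA class is treated as the positive class as an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F-score from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives respectively.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }
```

Precision penalizes legitimate domains flagged as malicious, whereas recall penalizes malicious domains that go undetected; the F-score is the harmonic mean of the two.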

VII Conclusion & Future Works

Domain Name System (DNS) plays an important role in the current IP-based Internet architecture. This is because it performs the domain name to IP resolution. However, the DNS protocol has several security vulnerabilities due to the lack of data integrity and origin authentication within it [6, 7]. This work focused on one particular security vulnerability, namely typo-squatting. Typo-squatting refers to the registration of a domain name that is extremely similar to that of an existing popular brand with the goal of redirecting users to malicious/suspicious websites. This is dangerous as it can lead to information threat, corporate secret leakage, and can facilitate fraud. This work extended our previous work in [1] and proposed an ensemble-based feature selection and bagging classification model to detect DNS typo-squatting attacks. Experimental results illustrated that the proposed framework achieves high accuracy and precision in identifying the malicious/suspicious typo-squatting domains (a loss of at most 1.5% in accuracy and 5% in precision when compared to the model that used the complete feature set) while having a lower computational complexity due to the smaller feature set (reduction of more than 50%).
Several potential research directions emerge to extend this work. One potential direction is collecting and exploring the impact of other features such as query sizes and timing. Another direction is studying the impact of a hybrid model that combines multiple techniques such as time series analysis and exploratory data analytics to further our understanding of the data behavior.

References

  • [1] A. Moubayed, M. Injadat, A. Shami, and H. Lutfiyya, “DNS Typo-Squatting Domain Detection: A Data Analytics & Machine Learning Based Approach,” in 2018 IEEE Global Communications Conference (GLOBECOM), Dec. 2018, pp. 1–7.
  • [2] P. Mockapetris and K. J. Dunlap, Development of the Domain Name System.    ACM, 1988, vol. 18, no. 4.
  • [3] M. A. Sharkh, M. Jammal, A. Shami, and A. Ouda, “Resource allocation in a network-based cloud computing environment: design challenges,” IEEE Communications Magazine, vol. 51, no. 11, pp. 46–52, Nov 2013.
  • [4] E. Aqeeli, A. Moubayed, and A. Shami, “Dynamic SON-Enabled Location Management in LTE Networks,” IEEE Transactions on Mobile Computing, vol. 17, no. 7, pp. 1511–1523, Jul. 2018.
  • [5] ——, “Power-Aware Optimized RRH to BBU Allocation in C-RAN,” IEEE Transactions on Wireless Communications, vol. 17, no. 2, pp. 1311–1322, Feb. 2018.
  • [6] A. Lioy, F. Maino, M. Marian, and D. Mazzocchi, “DNS security,” in TERENA Networking Conference, 2000.
  • [7] S. Ariyapperuma and C. J. Mitchell, “Security vulnerabilities in DNS and DNSSEC,” in Second International Conference on Availability, Reliability and Security (ARES’07), Apr. 2007, pp. 335–342.
  • [8] C. Liu, “Distributed-Denial-Of-Service Attacks And DNS,” Nov. 2017. [Online]. Available: http://www.forbes.com/sites/forbestechcouncil/2017/11/15/distributed-denial-of-service-attacks-and-dns/#67ccd2de6076
  • [9] N. Woolf, “DDoS Attack That Disrupted Internet Was Largest of its Kind in History, Experts Say,” Oct. 2016. [Online]. Available: http://www.theguardian.com/technology/2016/oct/26/ddos-attack-dyn-mirai-botnet
  • [10] A. Greenberg, “How Hackers Hijacked A Bank’s Entire Online Operation,” Apr. 2017. [Online]. Available: http://www.wired.com/2017/04/hackers-hijacked-banks-entire-online-operation/
  • [11] O. Lystrup, “Cybersquatting on the 2016 Presidential Campaign Trail,” Feb. 2016. [Online]. Available: http://umbrella.cisco.com/blog/2016/02/25/typosquatting-on-the-2016-presidential-campaign-trail/
  • [12] R. Mohan, “Five DNS Threats You Should Protect Against,” Oct. 2011. [Online]. Available: http://www.securityweek.com/five-dns-threats-you-should-protect-against
  • [13] V. Sucasas, G. Mantas, A. Radwan, and J. Rodriguez, “An oauth2-based protocol with strong user privacy preservation for smart city mobile e-health apps,” in 2016 IEEE International Conference on Communications (ICC), May 2016, pp. 1–6.
  • [14] M. Larson, D. Massey, S. Rose, R. Arends, and R. Austein, “DNS Security Introduction and Requirements,” 2005.
  • [15] R. Curtmola, A. Del Sorbo, and G. Ateniese, “On the Performance and Analysis of DNS Security Extensions,” in International Conference on Cryptology and Network Security.    Springer, 2005, pp. 288–303.
  • [16] L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi, “EXPOSURE: Finding Malicious Domains Using Passive DNS Analysis,” in 18th Annual Network and Distributed System Security Symposium (NDSS’11), Feb. 2011. [Online]. Available: http://www.eurecom.fr/publication/3281
  • [17] A. Moubayed, A. Refaey, and A. Shami, “Software-defined perimeter (sdp): State of the art secure solution for modern networks,” IEEE Network, vol. 33, no. 5, pp. 226–233, Sep.- Oct. 2019.
  • [18] A. Sallam, A. Refaey, and A. Shami, “On the security of sdn: A completed secure and scalable framework using the software-defined perimeter,” IEEE Access, vol. 7, pp. 146 577–146 587, 2019.
  • [19] P. Kumar, A. Moubayed, A. Refaey, A. Shami, and J. Koilpillai, “Performance analysis of sdp for secure internal enterprises,” in 2019 IEEE Wireless Communications and Networking Conference (WCNC), Apr. 2019, pp. 1–6.
  • [20] J. Ashraf and S. Latif, “Handling intrusion and DDoS attacks in Software Defined Networks using machine learning techniques,” in 2014 National Software Engineering Conference, Nov. 2014, pp. 55–60.
  • [21] M. Injadat, F. Salo, A. B. Nassif, A. Essex, and A. Shami, “Bayesian optimization with machine learning algorithms towards anomaly detection,” in 2018 IEEE Global Communications Conference (GLOBECOM), Dec 2018, pp. 1–6.
  • [22] M. Zekri, S. E. Kafhali, N. Aboutabit, and Y. Saadi, “DDoS attack detection using machine learning techniques in cloud computing environments,” in 2017 3rd International Conference of Cloud Computing Technologies and Applications (CloudTech’17), Oct. 2017, pp. 1–7.
  • [23] L. Yang, A. Moubayed, I. Hamieh, and A. Shami, “Tree-based intelligent intrusion detection system in internet of vehicles,” in Accepted in 2019 IEEE Global Communications Conference (GLOBECOM), Dec 2019, pp. 1–6.
  • [24] Y. Zhauniarovich, I. Khalil, T. Yu, and M. Dacier, “A survey on malicious domains detection through dns data analysis,” ACM Computing Surveys (CSUR), vol. 51, no. 4, pp. 1–36, 2018.
  • [25] S. Sivakorn, K. Jee, Y. Sun, L. Kort-Parn, Z. Li, C. Lumezanu, Z. Wu, L.-A. Tang, and D. Li, “Countering malicious processes with process-dns association.” in NDSS, 2019.
  • [26] S. Sountharrajan, M. Nivashini, S. K. Shandilya, E. Suganya, A. B. Banu, and M. Karthiga, “Dynamic recognition of phishing urls using deep learning techniques,” in Advances in Cyber Security Analytics and Decision Systems.    Springer, 2020, pp. 27–56.
  • [27] A. Almusawi and H. Amintoosi, “Dns tunneling detection method based on multilabel support vector machine,” Security and Communication Networks, vol. 2018, 2018.
  • [28] K. Fukuda, J. Heidemann, and A. Qadeer, “Detecting malicious activity with dns backscatter over time,” IEEE/ACM Transactions on Networking, vol. 25, no. 5, pp. 3203–3218, Oct. 2017.
  • [29] M. Weber, J. Wang, and Y. Zhou, “Unsupervised clustering for identification of malicious domain campaigns,” in Proceedings of the First Workshop on Radical and Experiential Security, 2018, pp. 33–39.
  • [30] M. B. Çatalkaya, O. Kalıpsız, M. S. Aktaş, and U. O. Turgut, “Data feature selection methods on distributed big data processing platforms,” in 2018 3rd International Conference on Computer Science and Engineering (UBMK), Sep. 2018, pp. 133–138.
  • [31] J. Li, K. Cheng, S. Wang, F. Morstatter, R. P. Trevino, J. Tang, and H. Liu, “Feature selection: A data perspective,” ACM Computing Surveys (CSUR), vol. 50, no. 6, p. 94, 2018.
  • [32] R. S. B. Krishna and M. Aramudhan, “Feature Selection Based on Information Theory for Pattern Classification,” in 2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT), Jul. 2014, pp. 1233–1236.
  • [33] K. Kumar, G. Kumar, and Y. Kumar, “Feature selection approach for intrusion detection system,” International Journal of Advanced Trends in Computer Science and Engineering (IJATCSE), vol. 2, no. 5, pp. 47–53, 2013.
  • [34] H. Y. Aydogmus, H. Erdal, O. Karakurt, E. Namli, Y. S. Turkan, and H. Erdal, “A comparative assessment of bagging ensemble models for modeling concrete slump flow,” Computers and Concrete, vol. 16, no. 5, pp. 741–757, 2015.
  • [35] M. A. Hall, “Correlation-based feature selection for machine learning,” Ph.D. dissertation, University of Waikato Hamilton, 1999.
  • [36] B. Bonev, “Feature selection based on information theory,” Ph.D. dissertation, University of Alicante, Jun. 2010.
  • [37] R. C. Holte, “Very simple classification rules perform well on most commonly used datasets,” Machine learning, vol. 11, no. 1, pp. 63–90, 1993.
  • [38] The Kernel Trip, “Computational complexity of machine learning algorithms,” Apr. 2018.
  • [39] C.-T. Chu, S. K. Kim, Y.-A. Lin, Y. Yu, G. Bradski, K. Olukotun, and A. Y. Ng, “Map-reduce for machine learning on multicore,” in Advances in neural information processing systems, 2007, pp. 281–288.
  • [40] J. Jacobs and B. Rudis, Data-Driven Security: Analysis, Visualization and Dashboards.    John Wiley & Sons, 2014.
  • [41] J. Jacobs, “Building a DGA Classifier: Part 1, Data Preparation,” Sep. 2014. [Online]. Available: http://datadrivensecurity.info/blog/posts/2014/Sep/dga-part1/
  • [42] J. H. Gennari, P. Langley, and D. Fisher, “Models of incremental concept formation,” Artificial intelligence, vol. 40, no. 1-3, pp. 11–61, 1989.
  • [43] G. Holmes and C. G. Nevill-Manning, “Feature selection via the discovery of simple classification rules,” 1995.
  • [44] A. Sharma and R. Rani, “Classification of cancerous profiles using machine learning,” in International Conference on Machine Learning and Data Science (MLDS’17), Dec. 2017, pp. 31–36.