Optimized Random Forest Model for Botnet Detection Based on DNS Queries
The Domain Name System (DNS) protocol plays a major role in today's Internet as it translates between website names and corresponding IP addresses. However, due to the lack of processes for data integrity and origin authentication, the DNS protocol has several security vulnerabilities. This often leads to a variety of cyber-attacks, including botnet network attacks. One promising solution to detect DNS-based botnet attacks is adopting machine learning (ML) based solutions. To that end, this paper proposes a novel optimized ML-based framework to detect botnets based on their corresponding DNS queries. More specifically, the framework consists of using information gain as a feature selection method and genetic algorithm (GA) as a hyper-parameter optimization model to tune the parameters of a random forest (RF) classifier. The proposed framework is evaluated using a state-of-the-art TI-2016 DNS dataset. Experimental results show that the proposed optimized framework reduced the feature set size by up to 60 precision, recall, and F-score compared to the default classifier. This highlights the effectiveness and robustness of the proposed framework in detecting botnet attacks.
READ FULL TEXT