Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and URL Data Analysis
Deep learning is a state of the art method for a lot of applications. The main issue is that most of the real-time data is highly imbalanced in nature. In order to avoid bias in training, cost-sensitive approach can be used. In this paper, we propose cost-sensitive deep learning based frameworks and the performance of the frameworks is evaluated on three different Cyber Security use cases which are Domain Generation Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL). Various experiments were performed using cost-insensitive as well as cost-sensitive methods and parameters for both of these methods are set based on hyperparameter tuning. In all experiments, the cost-sensitive deep learning methods performed better than the cost-insensitive approaches. This is mainly due to the reason that cost-sensitive approach gives importance to the classes which have a very less number of samples during training and this helps to learn all the classes in a more efficient manner.
READ FULL TEXT