Machine learning techniques have proven very useful in networking in general and in security-related topics in particular [2,3,4]. Deep learning [5-7], on the other hand, has improved the state of the art for many machine learning tasks such as speech recognition, object detection and natural language understanding. Its main advantage is that it allows very complex functions to be learned with a general-purpose learning procedure. We exclude deep learning-based intrusion detection systems from this study because other researchers have already surveyed that subject [8,9].
II Deep learning in security
i Malware Detection
Tobiyama et al. proposed malware process detection based on process behavior. The authors used long short-term memory (LSTM) for feature extraction and a convolutional neural network (CNN) for classification. A process behavior is a sequence of API calls. The features were extracted from the process behavior log files and converted into an image that contains local features. These local features mostly represent the process activities, so a CNN can be applied to capture them and classify the images. An overview of the proposed method is shown in Figure 1. In the experimental study the authors used 81 malware process log files and 69 benign process log files for the training and validation stages. The authors executed the malware files in the Cuckoo Sandbox and traced the process behavior in order to determine the produced and injected processes. The authors also validated the classifier with 5-fold cross-validation. The best result (AUC = 0.96) was achieved when the feature image size was 30x30.
Rhode et al. investigated whether an executable can be predicted to be malicious or benign based on behavioural data. The study showed that the model achieved a high level of accuracy (94%) based only on the first 5 seconds of file execution, using 3000 ransomware samples to which it had no prior exposure. In addition, the model achieved an accuracy of 96% in less than 10 seconds. The selected features were 10 continuous machine activity metrics rather than categorical API calls, because API calls can be manipulated, which may lead to incorrect classification of the input samples. In addition, continuous data allows a large number of states to be represented in a small vector. As shown in Figure 2, Cuckoo Sandbox was used to collect activity data of Portable Executable (PE) samples. The collected features are: system CPU usage, user CPU usage, packets sent, packets received, bytes sent, bytes received, memory use, swap use, the total number of processes currently running and the maximum process ID assigned. The authors used gated recurrent unit (GRU) cells instead of LSTM cells because they train faster. The authors stated that the model should be re-trained regularly with newly discovered samples, which may also lead to adjustments in the proposed architecture.
Chen et al. proposed HeNet, a hierarchical ensemble neural network based on control flow characterization of program execution. As shown in Figure 3, HeNet consists of a low-level behavior model and a top-level ensemble model. HeNet was tested on ROP attacks against Adobe Reader 9.3 on 32-bit Windows 7. The dataset used in the experimental study consists of 348 benign and 299 malicious PDF samples. HeNet achieved 100% accuracy with 0% false positives, a higher classification accuracy than traditional machine learning algorithms.
Hardy et al. proposed a malware detection model based on stacked AutoEncoders (SAEs). The model uses Windows API calls extracted from the collected PE files. As shown in Figure 4, the PE parser is used to extract the Windows API calls from each file. The API query database converts the API calls to a 32-bit representation of the corresponding API functions. Thereafter, the SAE is used to perform feature learning, fine-tuning and malware detection. The experimental study was conducted on a large dataset collected from Comodo Cloud Security Center. The dataset contained 50000 file samples, of which 22500 are malware, 22500 are benign files, and 5000 are unknown. The proposed model outperformed ANN, SVM, NB, and DT in malware detection with an accuracy of 96.85%.
Hou et al. proposed an Android malware detection framework based on Linux kernel system calls and stacked AutoEncoders (SAEs). A new dynamic analysis method named Component Traversal was introduced to automatically execute the code routines of each given Android app. In order to capture the relationships among the system calls, a weighted directed graph was constructed in which each node represents a system call and its size indicates its frequency, whereas a directed edge indicates the sequential flow between two system calls and carries a weight that reflects how often the successor node was called after the predecessor node. Each node with its weight, each edge with its weight, and the in-degree and out-degree of each node are used as input features for the proposed deep learning model. The experimental study was conducted on a large dataset collected from Comodo Cloud Security Center. The dataset contains 3000 Android apps, half of which are benign and half malicious. Compared with traditional machine learning methods, the deep learning framework improved detection performance, reaching an accuracy of 93.68%.
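A minimal sketch of the graph construction described above, assuming the system call trace is already available as an ordered list of call names. The function name `build_syscall_graph` and the toy trace are illustrative and are not taken from the original framework. Degrees here count distinct neighbouring calls, which is one possible reading of in-degree and out-degree.

```python
from collections import Counter


def build_syscall_graph(trace):
    """Build a weighted directed graph from an ordered list of system call names.

    Returns four counters: node weights (call frequency), edge weights
    (how often one call directly follows another), in-degrees and out-degrees.
    """
    node_weight = Counter(trace)
    # Each consecutive pair (a, b) means "b was called right after a".
    edge_weight = Counter(zip(trace, trace[1:]))
    in_degree = Counter()
    out_degree = Counter()
    for src, dst in edge_weight:  # iterate over distinct edges only
        out_degree[src] += 1
        in_degree[dst] += 1
    return node_weight, edge_weight, in_degree, out_degree


# Toy trace (hypothetical data, for illustration only).
trace = ["open", "read", "read", "write", "close", "open", "read", "close"]
nodes, edges, indeg, outdeg = build_syscall_graph(trace)
print(nodes["read"], edges[("open", "read")], indeg["close"], outdeg["read"])
```

The four counters could then be flattened into a fixed-order feature vector before being passed to a learning model. How the original work orders and scales these features is not stated here.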
Yuan et al. proposed DroidDetector, an online deep learning-based Android malware detection engine. The authors conducted static and dynamic analyses to extract features from each app. The extracted features fall into three main categories: (1) required permissions, (2) sensitive APIs and (3) dynamic behaviors. DroidDetector achieved 96.76% detection accuracy, which outperforms traditional machine learning techniques. The static phase includes parsing the two files AndroidManifest.xml and classes.dex in order to obtain a total of 120 permissions required by the app. The dynamic phase includes running each app in DroidBox in order to perform dynamic taint analysis and monitor a total of 13 app actions. As shown in Figure 6, the deep learning model used in this study consists of two phases: an unsupervised pre-training phase and a supervised back-propagation phase. In the pre-training phase, the Deep Belief Network (DBN) is built hierarchically by stacking a number of Restricted Boltzmann Machines (RBMs). In the back-propagation phase, the pre-trained DBN is fine-tuned with labeled samples in a supervised way. The experimental study was conducted on three public app sets. The first, a benign app set, was randomly crawled from the Google Play Store and contains a total of 20000 apps. The other two, both malicious app sets, were collected from the Contagio Community (only about 400 apps) and the Genome Project (1260 malicious apps).
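The static permission features mentioned above can be illustrated with a short sketch. It assumes the manifest has already been decoded from the binary form stored inside an APK into plain-text XML, and it uses a tiny hypothetical permission vocabulary instead of the 120 permissions used in the study. The helper names are illustrative and are not part of DroidDetector.

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in a decoded AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def requested_permissions(manifest_xml):
    """Return the set of permission names declared via <uses-permission>."""
    root = ET.fromstring(manifest_xml)
    names = {el.get(ANDROID_NS + "name") for el in root.iter("uses-permission")}
    return names - {None}  # ignore elements without an android:name attribute


def permission_vector(manifest_xml, vocabulary):
    """Encode the requested permissions as a binary vector over `vocabulary`."""
    requested = requested_permissions(manifest_xml)
    return [1 if name in requested else 0 for name in vocabulary]


# Hypothetical manifest and vocabulary, for illustration only.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <application/>
</manifest>"""
VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]
vector = permission_vector(SAMPLE_MANIFEST, VOCABULARY)
print(vector)
```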
Huang and Stokes proposed MtNet, a multi-task deep learning malware classification architecture trained for two tasks: binary classification, which predicts whether an unknown file is malicious or benign, and 100-class family classification, which predicts whether the file belongs to one of 98 important families, a generic malware class, or the benign class. The authors used low-level features extracted from dynamic analysis of the file as input for the training stage. These features are a sequence of application programming interface (API) call events plus their parameters and a sequence of null-terminated objects recovered from system memory during emulation. The number of features was reduced to 50000 using mutual information feature selection. Training a neural network with an input dimension this large is computationally intensive, so the authors used random projection to reduce the input dimension further. MtNet was trained and tested on an extremely large dataset of 6.5 million examples, of which 2.85 million were extracted from malicious files and 3.65 million from benign files. Of the malicious files, 1.3 million belong to the 98 malware families and 1.55 million to the generic malware class. A set of 4.5 million samples was used for training and another set of 2.0 million as a hold-out test set. The experimental results showed that MtNet achieved a binary malware error rate of 0.358% and a family error rate of 2.94%. Multi-task learning improved the classification results and showed low false positive rates (under 0.07%).
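To illustrate the dimensionality reduction step, the following is a minimal sketch of a dense Gaussian random projection applied to a sparse feature vector. It is one common variant of random projection; the text does not say which variant or which target dimension MtNet uses, so the dimensions, the seed and the function names below are assumptions made for illustration.

```python
import random


def make_projection(input_dim, output_dim, seed=0):
    """Create an output_dim x input_dim matrix with scaled Gaussian entries."""
    rng = random.Random(seed)
    scale = 1.0 / output_dim ** 0.5  # keeps expected squared norms comparable
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(input_dim)]
            for _ in range(output_dim)]


def project(sparse_features, matrix):
    """Project a sparse vector given as {feature_index: value} to low dimension."""
    return [sum(row[i] * v for i, v in sparse_features.items()) for row in matrix]


# Hypothetical sizes: 1000 input features reduced to 8 dimensions.
matrix = make_projection(input_dim=1000, output_dim=8, seed=42)
sample = {3: 2.0, 17: 1.0, 999: 5.0}  # only non-zero features are stored
low_dim = project(sample, matrix)
print(len(low_dim))
```

Because the projection is linear and fixed once generated, the same matrix has to be reused for training and test samples.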
Azmoodeh et al. proposed a deep Eigenspace learning approach to classify malicious and benign IoT applications. They extracted the OpCode sequences of 1078 benign and 128 malware samples from ARM-compatible IoT applications. The selected features (OpCodes) of each sample are converted into a graph, which is then classified using deep convolutional networks. They used Objdump to extract the OpCodes. Thereafter, n-gram OpCode sequences can be used to classify malware from its disassembled code. In addition, they proposed class-wise information gain to overcome the problem of imbalanced datasets and selected the top 82 features. As shown in Figure 8, the proposed approach consists of 2 phases: (1) an OpCode-Sequence Graph Generation phase and (2) a Deep Eigenspace Learning phase. Eigenspace learning was proposed because eigenvectors and eigenvalues are the two main components of a graph’s spectrum. The first two eigenvectors and eigenvalues of each sample are used as input values for the model. The proposed system achieved an accuracy of 98.37% and a precision of 98.59%, and was also able to mitigate junk code insertion attacks.
Kolosnjaji et al. proposed a classification of malware system call sequences based on convolutional and recurrent network layers. As shown in Figure 9, the authors combined convolutional and recurrent layers in one neural network in which the convolutional layer is used for feature extraction. The input of the system is 60 distinct system calls. The dataset used in the study was collected from three different sources: VirusShare, Maltrieve and private collections. Sample labels were obtained using the VirusTotal service, where the hash of each malware file is compared against the service’s database. The authors then clustered the signatures from different antivirus programs in order to obtain ground truth classes from the resulting clusters, which contain 4753 malware samples. The experimental study showed that the proposed approach outperformed traditional machine learning approaches and achieved an average precision of 85.6% and an average recall of 89.4%.
Rigaki and Garcia showed that malware behavioral patterns can be modified using GANs in order to mimic Facebook chat traffic. The real-life testing scenario was developed using the Stratosphere behavioral IPS in a router, whereas the malware and the GAN were deployed in the local network, and the command & control server was deployed in the cloud. The authors used an open source Remote Access Trojan (RAT) called Flu, which was modified in order to receive input from the generator and adapt its behavior accordingly. Flu queries the GAN through an HTTP API, which in turn replies with three parameters: total byte size, duration of the next network flow, and time delta between the current flow and the next one.
ii Botnet Detection
Botshark is a deep learning-based approach for analyzing botnet traffic from both centralized and P2P botnets. Botshark consists of two deep structures: the first employs stacked Autoencoders for feature extraction and the second uses CNNs to train a classifier for botnet detection. The authors used network traffic from the ISCX dataset, which includes 44.97% malicious flows from 16 different centralized and decentralized botnet topologies as well as normal traffic. The authors extracted NetFlows from the network traffic using the Argus tool. The extracted features comprise: (1) byte-based features, (2) time-based features, (3) packet-based features. The experimental study showed a true positive rate of 0.91 with a false positive rate of 0.13.
Torres et al. used an LSTM approach to model the behavior of network traffic as a sequence of states that changes over time. The behavior of a connection is computed from three features of each flow: size, duration and periodicity. Each flow is then assigned a state symbol based on the extracted features and an assignment strategy. Finally, each connection has its own string of symbols that represents its behavior. The proposed LSTM architecture was evaluated on two different datasets, where the first one was used for training and the second one for testing. The authors used a sampling approach to deal with the imbalanced datasets. They also studied the optimal number of connection states required for the input layer of the LSTM, and the best results were achieved with 25 connection states. The LSTM model showed an attack detection rate of 0.809 with a false alarm rate of 0.03 when tested on the second dataset. The experimental results showed that this approach was able to detect TCP-based malicious connections. However, it failed to identify most of the HTTP and HTTPS traffic, as well as some of the SMTP (spam) traffic. A possible solution to this problem is to observe the network traffic across multiple layers of the network and to monitor the sequences of bot activities together, which can possibly be merged based on a context and aim.
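The state assignment step can be sketched as follows. The thresholds, the three periodicity labels and the 27-symbol alphabet are hypothetical choices made for illustration; the text does not give the actual assignment strategy, and the periodicity of each flow is assumed to be computed beforehand.

```python
# All constants below are assumptions for illustration, not values from the paper.
SIZE_THRESHOLDS = (250, 1100)         # bytes: small / medium / large
DURATION_THRESHOLDS = (0.1, 10.0)     # seconds: short / medium / long
PERIODICITY = ("strong", "weak", "none")
ALPHABET = "abcdefghiABCDEFGHIrstuvwxyz"  # 3 x 3 x 3 = 27 symbols


def bucket(value, thresholds):
    """Return the index of the first threshold that `value` does not exceed."""
    for index, limit in enumerate(thresholds):
        if value <= limit:
            return index
    return len(thresholds)


def flow_symbol(size, duration, periodicity):
    """Map one flow (size, duration, periodicity label) to a state symbol."""
    p = PERIODICITY.index(periodicity)
    s = bucket(size, SIZE_THRESHOLDS)
    d = bucket(duration, DURATION_THRESHOLDS)
    return ALPHABET[p * 9 + s * 3 + d]


def connection_string(flows, max_states=25):
    """Concatenate the symbols of the first `max_states` flows of a connection."""
    return "".join(flow_symbol(*flow) for flow in flows[:max_states])


flows = [(100, 0.05, "strong"), (500, 1.0, "weak"), (5000, 20.0, "none")]
print(connection_string(flows))
```

The default of 25 states mirrors the input length reported above; the resulting string would be fed to the LSTM one symbol at a time.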
Yin et al. used generative adversarial networks (GANs) to generate ’fake’ samples in order to expand the number of labeled samples. A 3-layer LSTM network was used as the generator and the discriminator was replaced with a botnet detector. The authors selected 16 flow-based features. The experimental study was conducted on the ISCX botnet dataset. The GAN-based model decreased the false positive rate from 19.19% to 15.59%.
Tran et al. proposed a novel LSTM approach to detect modern sophisticated botnets that employ a domain generation algorithm (DGA) to generate a large number of domains that can be used to communicate with the Command and Control (C&C) server. DGA classification can be seen as a multiclass task, which can be performed either retrospectively or in real time. Real-time detection is mainly based on the domain name and linguistic features. This is much more difficult because linguistic features can be bypassed by the malware author. In general, LSTM is sensitive to the multiclass imbalance problem. In other words, it is naturally biased towards the prevalent classes, which results in an inability to detect the less common DGA families. The proposed algorithm adopts the cost-sensitivity principle to address the class imbalance problem by biasing learning towards the small classes. The authors modified the backward pass (i.e. the error computation) of the original LSTM. The new algorithm showed an improvement of 7% in terms of macro-averaged recall, precision and F1-score with respect to the original LSTM and other state-of-the-art solutions. The cost-sensitive LSTM, however, reduced the accuracy on the prevalent non-DGA class. It is also worth mentioning that the cost-sensitive LSTM is superior to the RUSBoost, oversampling and threshold-moving methods and achieved a much higher macro-averaged F1-score than HMM, C5.0, LSTM, cost-sensitive SVM, cost-sensitive C4.5 and Weighted Extreme Learning Machine on the multiclass imbalanced dataset.
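The idea of biasing the error computation towards small classes can be illustrated with a generic class-weighted cross-entropy. This is a sketch of the cost-sensitivity principle in general, not the authors' exact cost formulation. The cost function `(max_count / count) ** gamma`, the parameter `gamma` and the class names are assumptions.

```python
import math


def class_costs(counts, gamma=0.5):
    """Assign each class a cost that grows as the class gets rarer (assumed form)."""
    largest = max(counts.values())
    return {label: (largest / n) ** gamma for label, n in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_loss_and_grad(logits, target, cost):
    """Cost-weighted cross-entropy and its gradient with respect to the logits."""
    probs = softmax(logits)
    loss = -cost * math.log(probs[target])
    # The usual (softmax - one_hot) error signal, scaled by the class cost.
    grad = [cost * (p - (1.0 if i == target else 0.0)) for i, p in enumerate(probs)]
    return loss, grad


# Hypothetical class counts: one prevalent class and one rare DGA family.
costs = class_costs({"non_dga": 10000, "rare_family": 100})
loss, grad = weighted_loss_and_grad([0.0, 0.0], target=1, cost=costs["rare_family"])
print(costs, loss, grad)
```

With these numbers an error on the rare family produces a gradient ten times larger than the same error on the prevalent class, which is the kind of bias towards small classes described above.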
On the other hand, in the context of social bot detection for social media platforms, traditional deep learning techniques for text classification depended mainly on textual features alone [28,29]. However, employing additional features can be more effective and yield better results. Kudugunta and Ferrara proposed a deep neural network based on a contextual LSTM architecture that exploits both content and metadata in order to detect social bots at the tweet level. Contextual LSTM is a natural language processing (NLP) model based on the deep learning paradigm. The dataset includes over 8386 accounts and 11834866 tweets. The authors used 10 account-level features for the first level and 6 tweet-level features for the second level. This approach showed a high level of accuracy (> 99% AUC) for user-level detection.
iii Malicious Code Detection
Hendler et al. used a deep learning model based on character-level convolutional neural networks to detect malicious PowerShell commands. The authors used a large dataset of 66388 distinct PowerShell commands, 6290 labeled as malicious and 60098 labeled as benign. The authors treated each command as a raw signal at the character level and applied a one-dimensional CNN to it as a text classification task. The input length is 1,024 characters, so longer commands are truncated. The best performance was achieved by an ensemble detector that combines a traditional NLP-based classifier with a CNN-based classifier.
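A minimal sketch of how a command could be turned into a fixed-length character-level input is given below. Only the length of 1,024 and the truncation come from the text; the vocabulary (printable ASCII), the padding with zeros and the reserved index for unknown characters are assumptions made for illustration.

```python
import string

MAX_LEN = 1024   # input length stated in the text
PAD = 0          # assumed padding index
UNKNOWN = 1      # assumed index for characters outside the vocabulary

# Assumed vocabulary: printable ASCII characters, indexed from 2 upwards.
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}


def encode_command(command, vocab=VOCAB, max_len=MAX_LEN):
    """Map a command to exactly max_len integer ids (truncate, then pad)."""
    ids = [vocab.get(ch, UNKNOWN) for ch in command[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


encoded = encode_command("Get-Process")
print(len(encoded), encoded[:11])
```

Each id would typically be expanded to a one-hot or embedded vector before the one-dimensional convolution is applied.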
Saxe and Berlin proposed eXpose, a convolutional neural network approach for detecting malicious URLs, file paths and registry keys. The training dataset used in this research consists of a total of 14788254 instances (8332360 benign and 6455894 malicious samples), whereas the testing dataset consists of a total of 11531955 instances (10121981 benign and 1409974 malicious samples). As baselines, the authors used n-gram feature extractors and manually extracted features. These features were hashed into a 1024-dimensional vector and then fed into a deep MLP model. eXpose outperformed these manual feature extraction baselines, achieving a 5%-10% detection rate gain at a 0.1% false positive rate.
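The step of hashing n-gram features into a 1024-dimensional vector can be sketched with the standard hashing trick. The use of character n-grams of length 1 to 3 and of SHA-256 as the hash function are assumptions; the text only states that n-gram features are hashed into 1024 dimensions.

```python
import hashlib


def char_ngrams(text, n):
    """Return all contiguous character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def hashed_features(text, dim=1024, n_values=(1, 2, 3)):
    """Count n-grams into a fixed-size vector using the hashing trick."""
    vector = [0] * dim
    for n in n_values:
        for gram in char_ngrams(text, n):
            digest = hashlib.sha256(gram.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % dim
            vector[index] += 1  # colliding n-grams share a bucket
    return vector


url = "http://example.com/a"  # hypothetical input string
features = hashed_features(url)
print(len(features), sum(features))
```

A cryptographic hash is used here only because it is deterministic across runs, unlike Python's built-in `hash` for strings.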
This paper gives an overview of the potential of deep learning techniques in the security field. We mainly observed the high accuracy and efficiency of deep learning models in malware and botnet detection, in addition to their application in malicious code detection. These models showed a significant improvement when compared to traditional machine learning approaches. Therefore, we conclude that these models will see increased adoption in modern security solutions and applications.
-  Latah, M., Toker, L.: ‘Artificial Intelligence Enabled Software Defined Networking: A Comprehensive Overview’, 2018, arXiv preprint arXiv:1803.06818.
-  Xiao, L., Wan, X., Lu, X., Zhang, Y., Wu, D.: ‘IoT Security Techniques Based on Machine Learning’, 2018, arXiv preprint arXiv:1801.06275.
-  Li, J., Zhao, Z., Li, R.: ‘A Machine Learning Based Intrusion Detection System for Software Defined 5G Network’, IET Networks, 2018, 7,(2), pp. 53-60. DOI: 10.1049/iet-net.2017.0212
-  Mishra, P., Varadharajan, V., Tupakula, U., Pilli, E.S.: ‘A Detailed Investigation and Analysis of using Machine Learning Techniques for Intrusion Detection’, IEEE Communications Surveys & Tutorials, 2018, DOI: 10.1109/COMST.2018.2847722
-  LeCun, Y., Bengio, Y., Hinton, G.: ‘Deep learning’, Nature, 2015, 521,(7553), pp. 436-444, DOI: 10.1038/nature14539
-  Deng, L.: ‘A tutorial survey of architectures, algorithms, and applications for deep learning’, APSIPA Transactions on Signal and Information Processing, 2014, 3,(e2), pp. 1-19. DOI: 10.1017/atsip.2013.9
-  Deng, L., Yu, D.: ‘Deep learning: methods and applications’, Foundations and Trends in Signal Processing, 2014, 7,(3–4), pp. 197-387. DOI: 10.1561/2000000039
-  Kwon, D., Kim, H., Kim, J., Suh, S.C., Kim, I., Kim, K.J.: ‘A survey of deep learning-based network anomaly detection’, Cluster Computing, 2017, pp. 1-13. DOI: 10.1007/s10586-017-1117-8
-  Hodo, E., Bellekens, X., Hamilton, A., Tachtatzis, C., Atkinson, R.: ‘Shallow and deep networks intrusion detection system: A taxonomy and survey’, 2017, arXiv preprint arXiv:1701.02145.
-  Tobiyama, S., Yamaguchi, Y., Shimada, H., Ikuse, T., Yagi, T.: ‘Malware detection with deep neural network using process behavior’. In IEEE 40th Annual Conference on Computer Software and Applications, Atlanta, USA, June 2016, pp. 577-582.
-  Rhode, M., Burnap, P., Jones, K.: ‘Early-stage malware prediction using recurrent neural networks’, Computers & Security, 2018, 77, pp. 578-594.
-  Chen, L., Sultana, S., Sahita, R.: ‘HeNet: A Deep Learning Approach on Intel Processor Trace for Effective Exploit Detection’, 2018, arXiv preprint arXiv:1801.02318.
-  Hardy, W., Chen, L., Hou, S., Ye, Y., Li, X.: ‘DL4MD: A deep learning framework for intelligent malware detection’. In Proceedings of the International Conference on Data Mining (DMIN). The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), Las Vegas, USA, July 2016, pp. 61-67.
-  Hou, S., Saas, A., Chen, L., Ye, Y.: ‘Deep4maldroid: A deep learning framework for android malware detection based on linux kernel system call graphs’. In 2016 IEEE/WIC/ACM International Conference on Web Intelligence Workshops (WIW), Omaha, USA, Oct. 2016, pp. 104-111.
-  Yuan, Z., Lu, Y., Xue, Y.: ‘Droiddetector: android malware characterization and detection using deep learning’, Tsinghua Science and Technology, 2016, 21,(1), pp. 114-123, DOI: 10.1109/TST.2016.7399288
-  Huang, W., Stokes, J.W.: ‘MtNet: a multi-task neural network for dynamic malware classification’. In International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, San Sebastian, Spain, July 2016, pp. 399-418.
-  Azmoodeh, A., Dehghantanha, A., Choo, K.K.R.: ‘Robust Malware Detection for Internet Of (Battlefield) Things Devices Using Deep Eigenspace Learning’, IEEE Transactions on Sustainable Computing, 2018. DOI: 10.1109/TSUSC.2018.2809665
-  Kolosnjaji, B., Zarras, A., Webster, G., Eckert, C.: ‘Deep learning for classification of malware system call sequences’. In Australasian Joint Conference on Artificial Intelligence, Hobart, Australia, December 2016, pp. 137-149.
-  Rigaki, M., Garcia, S.: ‘Bringing a GAN to a Knife-fight: Adapting Malware Communication to Avoid Detection’. 1st deep learning and security workshop, San Francisco, USA, May 2016.
-  Homayoun, S., Ahmadzadeh, M., Hashemi, S., Dehghantanha, A., Khayami, R.: ‘BoTShark: A deep learning approach for botnet traffic detection’, Cyber Threat Intelligence, 2018, pp.137-153, DOI: 10.1007/978-3-319-73951-9_7
-  UNB ISCX botnet dataset, 2017. http://www.unb.ca/cic/datasets/index.html
-  Argus: auditing network activity, Jan. 2017, http://qosient.com/argus
-  Torres, P., Catania, C., Garcia, S., Garino, C.G.: ‘An analysis of recurrent neural networks for botnet detection behavior’. In Biennial Congress of Argentina (ARGENCON), Buenos Aires, Argentina, June 2016, pp. 1-6.
-  Acarali, D., Rajarajan, M., Komninos, N., Herwono, I.: ‘Survey of approaches and features for the identification of HTTP-based botnet traffic’, Journal of Network and Computer Applications, 2016, 76, pp.1-15. DOI: 10.1016/j.jnca.2016.10.007
-  Yin, C., Zhu, Y., Liu, S., Fei, J., Zhang, H.: ‘An enhancing framework for botnet detection using generative adversarial networks’. In IEEE 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, May 2018, pp. 228-234.
-  Tran, D., Mac, H., Tong, V., Tran, H.A., Nguyen, L.G.: ‘A LSTM based framework for handling multiclass imbalance in DGA botnet detection’, Neurocomputing, 2018, 275, pp.2401-2413. DOI: 10.1016/j.neucom.2017.11.018
-  Schiavoni, S., Maggi, F., Cavallaro, L., Zanero, S.: ‘Phoenix: DGA-based botnet tracking and intelligence’. In: Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA), Egham, UK, July 2014, pp. 192–211.
-  Kukar, M., Kononenko, I.: ‘Cost-sensitive learning with neural networks’. In: Proceedings of the Thirteenth European Conference on Artificial Intelligence (ECAI), Brighton, UK, Aug. 1998, pp. 445–449.
-  Kudugunta, S., Ferrara, E.: ‘Deep Neural Networks for Bot Detection’, 2018, arXiv preprint arXiv:1802.04289.
-  Ferrara, E., Varol, O., Davis, C., Menczer, F. and Flammini, A.: ‘The rise of social bots’, Commun. ACM, 2016, 59, (7), pp. 96-104. DOI: 10.1145/2818717
-  Hendler, D., Kels, S., Rubin, A.: ‘Detecting Malicious PowerShell Commands using Deep Neural Networks’. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, Incheon, Republic of Korea, June 2018, pp. 187-197.
-  Saxe, J. and Berlin, K.: ‘eXpose: A character-level convolutional neural network with embeddings for detecting malicious URLs, file paths and registry keys’, 2017, arXiv preprint arXiv:1702.08568.