Security Orchestration, Automation, and Response Engine for Deployment of Behavioural Honeypots

01/14/2022
by   Upendra Bartwal, et al.

Cyber security is a critical topic for organizations with IT/OT networks, as they are always susceptible to attack, whether from insiders or outsiders. Since the cyber landscape is ever-evolving, organizations must keep upgrading their security systems to protect their infrastructure. Tools like Security Information and Event Management (SIEM), Endpoint Detection and Response (EDR), Threat Intelligence Platform (TIP), and Information Technology Service Management (ITSM), along with other defensive techniques like Intrusion Detection System (IDS) and Intrusion Prevention System (IPS), enhance the cyber security posture of the infrastructure. However, these protection mechanisms have their limitations: they are insufficient to ensure security, and attackers still penetrate the network. Deception technology, along with honeypots, gives attackers a false sense of vulnerability in the target systems. The deceived attacker reveals threat intel about their modus operandi. We have developed a Security Orchestration, Automation, and Response (SOAR) Engine that dynamically deploys custom honeypots inside the internal network infrastructure based on the attacker's behavior. The architecture is robust enough to support multiple VLANs connected to the system and used for orchestration. The engine detects botnet traffic and DDoS attacks on the honeypots in the network and includes a malware collection system. After being exposed to live traffic for four days, our engine dynamically orchestrated the honeypots 40 times and detected 7823 attacks, 965 DDoS attack packets, and three malicious samples. While our experiments with static honeypots show an average attacker engagement time of 102 seconds per instance, our SOAR Engine-based dynamic honeypots engage attackers for 3148 seconds on average.

I Introduction

Honeypots have been in use since the late 90s and are employed to deceive attackers by presenting fake, vulnerable systems. They provide an extra layer of deception to organizations. Organizations can deploy and weaponize honeypots to defend themselves and gather threat intel about the types of attacks they face. Some challenges arise when honeypots are deployed:

  1. Studies have shown that the number of attacks on static honeypots decreases with a longer deployment period[Sehgal2020]. Hence, gathering threat intelligence becomes challenging.

  2. Detecting attackers inside an organization’s internal networks is very difficult, as insider attackers are familiar with the systems in place at their organization and can distinguish a honeypot from a real system faster than an outside attacker. The deception systems therefore need to be robust and intelligent enough to deceive them.

  3. Engaging the attackers becomes an arduous task if honeypots are not deployed dynamically.

Attacks on IT/OT infrastructure are very organization-specific; hence, customized deception is needed. Although there are different open-source threat intelligence feeds, they are insufficient for providing organization-specific threat intelligence. To protect the organization from insider attacks, we need organization-specific deception technology. The best way to implement it is to create a generalized deception solution and then customize it for each organization. We have developed a Security Orchestration, Automation, and Response Engine that dynamically deploys honeypots according to the attacker’s behavior in internal networks. We observed that dynamically deployed honeypots attract more attackers. The attacker also spends more time exploring them, and this longer engagement alerts the organization’s security team and gives them time to safeguard the real systems.

II Background and Related Work

II-A Background

As of 2021, 4.66 billion people worldwide are active Internet users[internet], and 57.7 billion US dollars was spent worldwide on cyber-security[stats]. Organizations and industries provide their services online and run extensive networks to handle their user base, which attracts cyber attacks. Organizations are also trying to defend their infrastructure and evolve to prevent and mitigate these attacks. Several commercial and open-source SIEM[siemsolutions], EDR[edrsolutions], TIP[tipsolutions] and ITSM[itsmsolutions] systems, along with IDS and IPS, help prevent these attacks. However, these systems have their limitations and fail to detect novel attacks, resulting in attackers penetrating the network[siemfail][edrfail][tipfail][itsmfail]. These defensive systems need reconfiguration every time there is a novel/zero-day attack. For example, the rules in the firewall, IDS, and IPS need to be updated, and the machine learning models need retraining. This requires continuous monitoring and uninterrupted, organization-specific threat intelligence. SOAR (Security Orchestration, Automation, and Response) systems help minimize these risks, as they are a collection of security software solutions and tools for browsing and collecting data from various sources. As per Gartner, SOAR systems enable organizations to collect inputs monitored by the security operations team.

Deception technology is a critical part of the cyber security infrastructure, as it helps organizations confront attackers. Honeypots are one of its core components, and a considerable number of honeypots have been developed over the last 20 years. Honeypots gather threat intel about attacks and trap the attacker into attacking fake systems, shielding the real ones. A honeypot deceives an attacker by acting as one of the attacker’s targets. Luring an attacker is one side of the coin; engaging the attacker for as long as possible is the other. Orchestrating the honeypots as needed preserves their credibility and saves resources and time. There has been much work on honeypot development, so there is little room for more novel honeypots. However, there has been hardly any work on intelligently deploying honeypots in the network. Most of the time, honeypots are built and deployed on a system in the network 24 hours a day, seven days a week, which wastes resources if the honeypots remain unused and makes them easy to identify. Following are the datasets that we used:

II-A1 HTTP Honeypot Host-Based IDS

We have developed machine-learning models to classify attacks in the HTTP honeypots that are dynamically deployed by the SOAR Engine. The HTTP Host-Based Intrusion Detection System classifies three HTTP attacks, namely Cross-Site Scripting (XSS)[OWASP], SQL Injection (SQLi)[OWASP], and OS Command Injection (OSC)[OWASP]. For training, we combined the ECML/PKDD 2007 Dataset[DS1] and the HTTP CSIC Torpeda 2012 Dataset[DS2] into one large dataset. The breakdown of the dataset in terms of attack and normal data points:

S No.  Attack Type  Total Requests  Attack Requests  Benign Requests  Training Set  Test Set
1.     XSS          49761           6573             43188            37320         12441
2.     SQLi         81850           38662            43188            61387         20463
3.     OSC          45490           2302             43188            34117         11373

TABLE I: Dataset Breakdown for HTTP IDS

II-A2 Botnet Detection

We have developed machine-learning models to classify traffic as botnet or regular traffic as a part of the SOAR Engine. For training, we used the CTU-13 dataset[GARCIA2014100], captured at CTU University, Czech Republic, in 2011. The dataset was created to capture botnet traffic mixed with regular traffic.

S No.  Data-point Tag  Data-points (No.)  Training Points  Test Points
1      Botnet          329183             247077           82106
2      Normal          681495             510931           170564
3      Total           1010678            758008           252670

TABLE II: Dataset breakdown for Botnet Detection

II-A3 DDoS Detection

We have developed machine-learning models to classify traffic as DDOS or regular traffic as a part of the SOAR Engine. For training, we used the CIC-IDS 2017 dataset [10.1007/978-3-030-25109-3_9]. The dataset was captured for a week, and each day captured different attacks. The DDOS attack was captured on Wednesday, so for training the model, the PCAP files of Wednesday have been used.

S No.  Data Tags    Data-points (No.)  Training Points  Testing Points
1      DDoS Attack  656994             349639           307355
2      Normal       1343006            1045737          290664
3      Total        2000000            1395376          598019

TABLE III: Dataset breakdown for DDoS Detection

II-B Related Work

To our knowledge, no prior work orchestrates honeypots in this way, although some work comes close to the concept. One paper[idsarticle] proposed identifying the unused IPs in the network and assigning them to honeypots. It also proposed using low-interaction honeypots: an incoming packet is first redirected to a low-interaction honeypot and then sent on to a high-interaction honeypot to obtain a valid response. Another paper[resularticle] proposed deploying honeypots through a graphical user interface used to create and destroy a honeypot manually. A third paper[unknown] proposed deploying a honeypot when a chosen alarm is triggered, using Docker for honeypot deployment. Many commercial software solutions[commercialdeception] claim dynamic deception. However, the workings of these solutions are not open, and no data validating them could be found. The research gap is that there are no methods that dynamically orchestrate a honeypot so that its detection is delayed for as long as possible and that automate deployment in line with the attacker’s interest in the organization.

III Problem Statement

Let us consider an organization with an internal network in which several devices are connected to the internet, plus several air-gapped networks that host critical applications, all protected using an amalgamation of SIEM, EDR, TIP, and ITSM. As discussed in the previous sections, these systems do not guarantee the security of a network and fail to detect novel/zero-day attacks and insider threats[insiderattacksexp][databreach60]. Air-gapped networks also remain vulnerable, because physically detaching a network from the internet does not guarantee its security, as seen in various cases[airgapattack2]. To mitigate these problems, we need a deception defense mechanism that engages the attacker longer and provides uninterrupted organization-specific threat intelligence. We propose a SOAR Engine that:

  1. Intelligently deploys and deletes honeypots in one or multiple networks, hence attracting more attackers.

  2. Saves CPU time and increases attacker engagement time in the honeypots.

  3. Implements DDoS attack detection, botnet detection, and a malware collection system that monitor the honeypot network for any malicious activity and report it.

The SOAR engine adds an extra layer of deception in the internal network on top of the techniques already in place. The engine also saves resources and deceives intelligently, securing the organization’s internal network. With our novel SOAR engine integrated into the deception layer, organizational networks have an extra layer of security even if an attacker has infiltrated the network undetected.

IV Proposed Methodology

We have developed a SOAR Engine that deploys honeypots according to attacker behavior inside the network. A few IPs inside the network are reserved and used to deploy honeypots. These reserved IPs are not known to genuine users. The traffic to these IPs is continuously monitored, and any incoming traffic to them can be presumed to have malicious intent. The engine then orchestrates the honeypots according to the traffic and uses the attacker’s responses to orchestrate the honeypots even further. To borrow a comparison from real-life warfare, these reserved IPs act as mines for the attacker. The SOAR engine can be considered the last line of defense for an organizational network. It involves many components and combines different technologies into a system that organizations can use to protect their network. This section describes the architecture of the SOAR engine and its interaction with the rest of the components.

IV-A System Design

Fig. 1: Components of SOAR Engine
  1. Host Machine: Hosts every component.

  2. Virtual Machine: Hosts the honeypots, used to connect multiple VLANs.

  3. Honeypots: Base of the SOAR Engine. Used to trap attackers in the VLAN networks.

  4. Container Registry: Server that hosts the honeypot images.

  5. Storage: Stores logs, backups and files for the SOAR Engine.

  6. Traffic Tracker: Monitors the incoming and outgoing traffic to the reserved IPs in the virtual machine.

  7. Botnet Detector: Detects Botnet traffic to the reserved IPs.

  8. DDOS Detector: Detects DDOS traffic to the reserved IPs.

  9. Orchestration Tool: Orchestrates and automates the other components so that no human involvement is required except for starting the engine. The traffic tracker component captures all the incoming traffic on the host, and based on that traffic, the orchestration tool decides whether to start a honeypot. Initially, no honeypot is deployed, on the assumption that no attacker is present inside the network. As soon as traffic to the reserved IPs is detected, implying an attacker’s presence, the orchestration tool deploys the respective honeypots based on the attacker’s interaction. As the attacker interacts with the honeypots, the tool deploys further honeypots based on the interaction with the previous ones to ensure longer engagement. The orchestration tool uses a hybrid approach (ML and rule-based) for decision-making.

  10. Access Logs: Generates logs in every component of the SOAR engine, which is then stored in the storage component.

Fig. 2: SOAR Engine Architecture

IV-B Implementation

The strength of a honeypot depends on its design and deployment techniques. Since we are inviting a series of attacks on the honeypot systems, the honeypots must work reliably. The SOAR engine implements dynamic honeypot deployment using a set of rules together with decisions from machine-learning models.

IV-B1 Virtual Machine Setup

A virtual machine is created with VirtualBox, running Ubuntu Server 18.04 LTS with an OpenSSH server and Docker installed. Through its GUI, VirtualBox allows connecting four adapters to the physical Ethernet ports of the machine. The other four adapters are added through the command line using VirtualBox’s own set of commands. The VM thus has eight adapters: seven are connected to the network in the VLAN, and one, set to host-only, transfers logs and files from the VM to the host. The exported VirtualBox image (OVA file) of this virtual machine is used as the base image.
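Adding the extra adapters from the command line could look roughly like the sketch below. It only builds the argument lists for `VBoxManage modifyvm` and does not run them. The VM name `soar-vm`, the host interface names `eth0` to `eth6`, and the host-only network `vboxnet0` are placeholders that do not come from the paper, and the `--nicN`, `--bridgeadapterN`, and `--hostonlyadapterN` spellings are assumed from the VirtualBox 6.x manual, so they should be checked against the installed release.

```python
def nic_args(vm_name, slot, mode, host_iface):
    """Build one `VBoxManage modifyvm` command for a single adapter slot.

    mode is "bridged" or "hostonly"; slot is assumed to run from 1 to 8.
    All names passed in are placeholders, not values from the paper.
    """
    if not 1 <= slot <= 8:
        raise ValueError("adapter slots are assumed to run from 1 to 8")
    attach_flag = {"bridged": "--bridgeadapter",
                   "hostonly": "--hostonlyadapter"}[mode]
    return ["VBoxManage", "modifyvm", vm_name,
            f"--nic{slot}", mode,
            f"{attach_flag}{slot}", host_iface]


# Hypothetical layout: slots 1-7 bridged towards the VLAN, slot 8 host-only.
vlan_ifaces = [f"eth{n}" for n in range(7)]   # placeholder host interface names
commands = [nic_args("soar-vm", i + 1, "bridged", iface)
            for i, iface in enumerate(vlan_ifaces)]
commands.append(nic_args("soar-vm", 8, "hostonly", "vboxnet0"))

for cmd in commands:
    # To apply a command, each list could be passed to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Building the argument lists separately from executing them keeps the sketch safe to run on a machine without VirtualBox installed.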

IV-B2 Network Setup

The SOAR Engine is deployed on a VLAN with seven IPs reserved for deploying honeypots. These static IPs are spread at roughly equal intervals throughout the subnet. For example, in a VLAN with subnet 172.26.233.0, the reserved IPs are 172.26.233.4, 172.26.233.40, 172.26.233.85, 172.26.233.125, 172.26.233.185, 172.26.233.220, and 172.26.233.250. In this way, all seven IPs are reserved and distributed throughout the subnet. DHCP does not allocate these IPs to any genuine user or service in the VLAN.
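As a small illustration of this layout, the sketch below stores the seven example addresses and checks whether a destination IP is one of them. The `/24` prefix length and the helper names `is_reserved` and `gaps` are assumptions made for the example; the paper only gives the subnet address and the reserved IPs.

```python
import ipaddress

# Reserved addresses from the example; the /24 prefix is an assumption.
SUBNET = ipaddress.ip_network("172.26.233.0/24")
RESERVED_IPS = [ipaddress.ip_address(f"172.26.233.{host}")
                for host in (4, 40, 85, 125, 185, 220, 250)]


def is_reserved(ip):
    """Return True if `ip` is one of the addresses kept for honeypots."""
    return ipaddress.ip_address(ip) in RESERVED_IPS


def gaps(addresses):
    """Distance between consecutive reserved addresses, to inspect the spread."""
    values = [int(a) for a in addresses]
    return [b - a for a, b in zip(values, values[1:])]


print(all(a in SUBNET for a in RESERVED_IPS))   # every reserved IP lies in the subnet
print(is_reserved("172.26.233.40"), is_reserved("172.26.233.41"))
print(gaps(RESERVED_IPS))                       # spacing between neighbouring IPs
```

The spacing in the example is not perfectly uniform, which is why the gaps are printed instead of being assumed constant.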

IV-B3 Working of the SOAR Engine

On initialization, the SOAR Engine receives input of the number of virtual machines to start, the RAM size of each virtual machine, the number of CPU cores to allocate, whether it should run the previously stopped instances or start fresh, and the host’s network interface through which the virtual machine should be connected.

Working of the SOAR Engine in chronological order:

  1. Start the container registry, orchestration tool, DDOS detector, and Botnet Detector on different threads per the inputs received.

  2. Transfer all the honeypot images required for the VLAN sub-net from the container registry to the VM.

  3. Analyze the incoming packets in the physical ethernet card (NIC) connected to the VLAN while checking some attributes:

    1. If the destination IP of the packet belongs to any static IPs assigned to deploy honeypots.

    2. If the destination port is of some application running in the VLAN. For example, if an HTTP server runs on VLAN, there would be an HTTP honeypot in the container registry. So, the SOAR engine will look for port 80 being pinged/accessed in the reserved IPs.

    3. If both of the above conditions hold, someone, possibly a malicious insider or attacker, may be scanning the network. The SOAR Engine then triggers deployment of the honeypot for the service that uses the packet’s destination port. For example, if the packet’s destination port is 80, the SOAR Engine deploys an HTTP honeypot on the next reserved IP. If an HTTP honeypot is already running on the next reserved IP, the engine sets that honeypot’s latest-incoming-traffic timestamp to the current time, meaning that the honeypot is not idle. Idle honeypots are deleted after being idle for some time.

  4. The SOAR Engine analyzes the host-based logs of the running honeypots. For example, suppose that in a VLAN with subnet 172.26.233.0 the reserved IPs are 172.26.233.4, 172.26.233.40, 172.26.233.85, 172.26.233.125, 172.26.233.185, 172.26.233.220, and 172.26.233.250, and that an HTTP honeypot is deployed at 172.26.233.4. The orchestration tool monitors and analyzes the logs of the HTTP honeypot and deploys the respective honeypot on the following IP. If an SQL Injection attack is detected in the HTTP honeypot at 172.26.233.4, then an SQL Injection honeypot is deployed at 172.26.233.40. In this way, the attacker stays engaged in attacking the reserved IPs, which act as mines, alerting the organization’s security team and providing more time to protect the existing infrastructure.
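The log-driven escalation in step 4 can be sketched as a lookup from the detected attack class to a follow-up honeypot, placed on the next free reserved IP. The image names and the function `next_deployment` are hypothetical; the paper states only that SQLi, XSS, and OSC honeypots exist in the container registry.

```python
# Hypothetical image names; only the attack classes come from the paper.
FOLLOW_UP_IMAGE = {
    "SQLi": "http-sqli-honeypot",
    "XSS": "http-xss-honeypot",
    "OSC": "http-osc-honeypot",
}


def next_deployment(detected_attack, current_ip, reserved_ips, occupied):
    """Pick the follow-up honeypot and the next free reserved IP after current_ip.

    Returns (image, ip), or None when the attack class is unknown or no
    later reserved IP is free.
    """
    image = FOLLOW_UP_IMAGE.get(detected_attack)
    if image is None:
        return None
    start = reserved_ips.index(current_ip) + 1
    for ip in reserved_ips[start:]:
        if ip not in occupied:
            return image, ip
    return None


reserved = ["172.26.233.4", "172.26.233.40", "172.26.233.85"]
print(next_deployment("SQLi", "172.26.233.4", reserved, occupied={"172.26.233.4"}))
```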

Let:
    Packet      = incoming packet
    IP_dst      = IP to which the packet is addressed
    IP_port     = destination port of the incoming packet
    IP_honeypot = set of reserved IPs x, spaced at a fixed interval j
    Ports       = set of ports p of the services running in the VLAN
    Honeypots   = set of applications x such that x is exposed on a port p in Ports
Assumptions:
    All the VMs are started and configured with IP_honeypot and with a host-only IP
for each Packet do
    if IP_dst ∈ IP_honeypot then
        if IP_port ∈ Ports then
            if ∃ hp ∈ Honeypots such that hp is exposed on IP_port then
                IP ← get the IP for deploying the honeypot
                HostIP ← get the host IP on which IP is assigned
                Deploy Honeypot on IP if not already deployed
                Start supporting modules in a separate thread
Result: Honeypot is deployed
Algorithm 1: Algorithm for honeypot deployment by SOAR Engine
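A minimal Python sketch of the per-packet decision in Algorithm 1 follows. The class `Engine`, its field names, the image names, and the simple `pick_ip` allocation are all assumptions for illustration; starting a container is replaced by recording the image in a dictionary, and the supporting modules are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    reserved_ips: list          # IP_honeypot, in subnet order
    image_for_port: dict        # destination port -> honeypot image name
    deployed: dict = field(default_factory=dict)   # reserved IP -> running image

    def pick_ip(self, dst_ip):
        """Placeholder allocation: first free reserved IP other than dst_ip."""
        for ip in self.reserved_ips:
            if ip != dst_ip and ip not in self.deployed:
                return ip
        return None

    def handle_packet(self, dst_ip, dst_port):
        """Return the IP now hosting the matching honeypot, or None."""
        if dst_ip not in self.reserved_ips:          # is IP_dst a reserved IP?
            return None
        image = self.image_for_port.get(dst_port)    # is there a honeypot for the port?
        if image is None:
            return None
        for ip, running in self.deployed.items():    # already deployed: reuse it
            if running == image:
                return ip
        target = self.pick_ip(dst_ip)
        if target is not None:
            self.deployed[target] = image            # stand-in for starting the container
        return target


engine = Engine(["172.26.233.4", "172.26.233.40", "172.26.233.85"],
                {80: "http-honeypot", 22: "ssh-honeypot"})
print(engine.handle_packet("172.26.233.4", 80))    # reserved IP, known port
print(engine.handle_packet("172.26.233.9", 80))    # not a reserved IP
print(engine.handle_packet("172.26.233.4", 443))   # no honeypot for this port
```

Packets to non-reserved addresses or to ports without a matching honeypot are ignored, which mirrors the nested conditions of the algorithm.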

With seven reserved IPs available, the engine needs a way to choose among them according to the attacker’s behavior. During the reconnaissance phase, an attacker scans the network to gather information about the services running on the deployed systems. By the time we deploy a honeypot, the attacker may already have scanned the IP on which it is deployed, in which case the deployment would be of no use. We have therefore developed an algorithm that deploys the honeypots according to the attacker’s behavior. There is no redundancy, and honeypots are not deployed on reserved IPs that have already been scanned. Algorithm 2 shows the IP allocation algorithm we have developed.

Let:
    IP_dst      = IP on which the packet came
    IP_honeypot = set of reserved IPs x, spaced at a fixed interval j, with i a constant denoting the first IP
    IP_assigned = set of IPs x in IP_honeypot on which a honeypot is already deployed
    IP_no       = number of IPs to return
if IP_dst is the first IP of IP_honeypot then
    IP_list ← IPs from IP_honeypot except IP_dst
    IP_list ← IP_list − IP_assigned
    return first IP_no IPs from IP_list
else
    IP_list ← IPs from IP_honeypot except IP_dst
    IP_list ← IP_list − IP_assigned
    return last IP_no IPs from IP_list
Result: Reserved IPs that are free currently
Algorithm 2: Algorithm for IP selection to deploy honeypots in reserved IPs
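Algorithm 2 can be sketched in a few lines of Python. The function name `select_ips` and its parameters are illustrative; `assigned` is taken here to mean the reserved IPs that already host a honeypot, which is an interpretation consistent with the stated result (free reserved IPs).

```python
def select_ips(dst_ip, reserved_ips, assigned, count):
    """Return up to `count` free reserved IPs for new honeypots.

    reserved_ips is ordered from the first to the last reserved address;
    assigned holds the reserved IPs that already run a honeypot.
    """
    candidates = [ip for ip in reserved_ips
                  if ip != dst_ip and ip not in assigned]
    if count <= 0:
        return []
    if dst_ip == reserved_ips[0]:
        return candidates[:count]     # packet hit the first IP: take the first free ones
    return candidates[-count:]        # otherwise take the last free ones


reserved = [f"172.26.233.{h}" for h in (4, 40, 85, 125, 185, 220, 250)]
busy = {"172.26.233.40"}
print(select_ips("172.26.233.4", reserved, busy, 2))
print(select_ips("172.26.233.185", reserved, busy, 2))
```

The explicit check for a non-positive `count` avoids the Python pitfall that `candidates[-0:]` would return the whole list.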

The deployed honeypots are continuously checked for newly added files. New files are sent to the repository in the storage component, from where they are sent for further analysis. The honeypots are also checked for activity. The metadata of each deployed honeypot, including the timestamp of the last incoming packet, is checked against the time the honeypot was deployed. If a honeypot has had no activity for the last 15 minutes, a backup of its image is stored, and the honeypot is then deleted.
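The idle check can be illustrated with the sketch below. The dictionary layout and the function `idle_honeypots` are assumptions; only the 15-minute threshold comes from the text, and the backup and delete steps are represented by a print statement.

```python
IDLE_LIMIT_SECONDS = 15 * 60   # idle threshold stated in the text


def idle_honeypots(last_packet_at, now):
    """Return the IPs whose honeypot has seen no packet for 15 minutes or more.

    last_packet_at maps a reserved IP to the Unix time of its latest packet
    (or of its deployment, if no packet has arrived yet).
    """
    return sorted(ip for ip, seen in last_packet_at.items()
                  if now - seen >= IDLE_LIMIT_SECONDS)


now = 10_000
last_seen = {
    "172.26.233.4": now - 30,      # active half a minute ago
    "172.26.233.40": now - 900,    # exactly 15 minutes idle
    "172.26.233.85": now - 4000,   # idle for over an hour
}
for ip in idle_honeypots(last_seen, now):
    print(f"back up image, then delete honeypot on {ip}")
```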

Fig. 3: Execution Chart of the SOAR Engine

Since our primary goal is to deploy honeypots intelligently, we used modified versions of some open-source honeypots. All of them have been dockerized to ensure they are lightweight and could restart quickly in case of any failure. The honeypots we used are:

  1. Web Server — apache/httpd — Apache Server Docker used as high interaction HTTP honeypot.

  2. Application Server — Tomcat — Tomcat Server docker for high interaction HTTP application server honeypot.

  3. Database Server — MySQL — MySQL Server version 5.7 docker for high interaction database server honeypot.

  4. SSH — Cowrie — Medium to high interaction open-source honeypot

  5. SMTP — Mailoney — Open-Source SMTP Honeypot written in python

  6. Modbus — Custom Modbus Server — Modbus docker image used for high interaction modbus honeypot.

IV-B4 HTTP IDS, Botnet, and DDoS Detection

In a real-world scenario, several applications interact to form a complete solution. The deployed HTTP honeypot is designed to display a page that takes input from the user and stores it in a database. Hence, the Tomcat server, HTTP server, and database server are used. There are also three other HTTP honeypots with specific vulnerabilities in the container registry: SQL Injection (affects the database server)[OWASP], Cross-Site Scripting (affects the application server)[OWASP], and OS Command Injection (affects the application server)[OWASP].

We modified the IDS developed by Bhagwani et al.[10.1007/978-3-030-35869-3_10] to predict SQLi, XSS, and OSC attacks on HTTP servers. The SOAR Engine uses these machine learning models to detect attacks in the logs of the initially deployed HTTP honeypot and then deploys HTTP honeypots with specific vulnerabilities, such as SQLi, XSS, and OSC, on other IPs. Different machine learning models were implemented and cross-validated, and the models that gave the best results on the cross-validation set were selected.

HTTP IDS

Features: All the attributes in the features are frequency-based. Hence we deleted some features since the sum of the frequency was zero. Table IV shows the features that we deleted:

S No  Attack  Features
1     XSS     ' !', '', '<>', '[]', 'createelement', 'search', 'eval()', 'string.fromcharcode'
2     SQLi    '-', "/**/", "'", ';', '#', '[', ']', '(', ')', '', '|', '<>', '<=', '>=', '&&', '||', ':', ' !=', '()'
3     OSC     '..\\', '\\.', '\\/', ':/', 'etc/passwd', '`'

TABLE IV: Excluded features from the referenced attributes

Detection Models: The data for XSS and OSC were imbalanced. To deal with the imbalance, the class_weight attribute of each algorithm was used. In some cases, the accuracy of our models was better than that of the referenced paper. In Tables V, VI, and VII, rows 1, 3, and 5 show the metrics computed by our models, whereas rows 2, 4, and 6 show those reported in the referenced paper.
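To give a feel for how class weighting offsets the imbalance, the sketch below computes weights with the common "balanced" heuristic, weight = n_samples / (n_classes × n_class), which is the formula scikit-learn documents for `class_weight="balanced"`. Whether the authors used this setting or hand-picked weights is not stated, so this is an assumption; the counts are the full request counts from Table I, used here only for illustration.

```python
def balanced_class_weights(counts):
    """Compute weight_c = n_samples / (n_classes * n_c) for every class c."""
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


# Request counts from Table I (attack vs. benign).
xss = balanced_class_weights({"attack": 6573, "benign": 43188})
osc = balanced_class_weights({"attack": 2302, "benign": 43188})

print({label: round(w, 3) for label, w in xss.items()})
print({label: round(w, 3) for label, w in osc.items()})
```

The rarer the attack class, the larger its weight, so OSC attack requests receive a higher weight than XSS attack requests under this heuristic.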

S No.  Classifier     Source     Accuracy  Precision  Recall  F-Score
1      Decision Tree  Ours       98.81     98.83      98.81   98.79
2      Decision Tree  Reference  99.27     98.4       97.55   97.98
3      SVM            Ours       98.191    98.2       98.19   98.14
4      SVM            Reference  98.66     98.32      98.88   98.6
5      LR             Ours       98.47     98.49      98.47   98.43
6      LR             Reference  98.04     98.4       97.55   97.98

TABLE V: ML models and accuracy for XSS (in %)

S No.  Classifier     Source     Accuracy  Precision  Recall  F-Score
1      Decision Tree  Ours       99.06     99.08      99.06   99.06
2      Decision Tree  Reference  96.78     95.83      97.95   96.88
3      SVM            Ours       98.12     98.14      98.12   98.12
4      SVM            Reference  95.57     95         96.44   95.71
5      LR             Ours       98.41     98.42      98.41   98.41
6      LR             Reference  95.7      95.16      96.53   95.84

TABLE VI: ML models and accuracy for SQLi (in %)

S No.  Classifier     Source     Accuracy  Precision  Recall  F-Score
1      Decision Tree  Ours       98.57     98.58      98.57   98.46
2      Decision Tree  Reference  97.29     98.61      94.32   96.42
3      SVM            Ours       98.02     97.99      98.02   97.81
4      SVM            Reference  97.95     99.4       96.36   97.85
5      LR             Ours       98.44     98.46      98.44   98.31
6      LR             Reference  97.85     99.01      96.53   97.75

TABLE VII: ML models and accuracy for OSC (in %)

Botnet Detection

The botnet detection module collects net flows in the network and classifies them as botnet flow or normal flow. An Argus server is started, which collects the net flows for one minute, and the total flow collected during this interval is classified as botnet or normal. If a botnet flow is detected, its source and destination IP addresses are extracted and reported.

Netflow Format: The argus server creates this net flow file after being fed with the attributes required to detect botnet traffic. A flow is defined by the IP pair and port number pair.

Features: The features used for botnet traffic classification are inspired by [git1]; we modified them to fit our requirement of collecting the traffic flow for a minute. The features are: duration of the packet transfer, protocol used, source port, destination port, source IP, destination IP, total bytes transferred, state of the netflow entry, and total packets transferred. If the duration of a packet or byte transfer is greater than one minute, the value is taken as total packets or bytes per minute.
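One possible reading of the per-minute rule is sketched below: totals of flows that outlast the one-minute window are scaled down to a one-minute rate, while shorter flows keep their raw totals. This interpretation, the field names of the flow record, and the helper names are assumptions, not the authors' implementation.

```python
WINDOW_SECONDS = 60   # collection window stated in the text


def per_minute(total, duration_seconds):
    """Scale a flow total to a one-minute rate when the flow outlasts the window."""
    if duration_seconds > WINDOW_SECONDS:
        return total * WINDOW_SECONDS / duration_seconds
    return float(total)


def flow_features(flow):
    """Build the feature dict for one flow record (hypothetical field names)."""
    return {
        "duration": flow["dur"],
        "proto": flow["proto"],
        "sport": flow["sport"],
        "dport": flow["dport"],
        "src_ip": flow["saddr"],
        "dst_ip": flow["daddr"],
        "state": flow["state"],
        "bytes": per_minute(flow["bytes"], flow["dur"]),
        "packets": per_minute(flow["pkts"], flow["dur"]),
    }


example = {"dur": 120, "proto": "tcp", "sport": 51515, "dport": 80,
           "saddr": "172.16.238.5", "daddr": "172.26.233.4",
           "state": "CON", "bytes": 1200, "pkts": 40}
print(flow_features(example))
```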

Detection Models: Our Botnet Detection technique provides better accuracy than existing methods[9299061]. The highest accuracy in the referenced method is 99.89%, whereas the highest accuracy achieved by our method is 99.95%.

S No.  Classifier     Accuracy  Precision  Recall  F-Score
1      Decision Tree  99.95     99.95      99.95   99.95
2      SVM            97.55     97.68      97.55   97.57

TABLE VIII: ML models and Accuracy of Botnet detection model (in %)

DDoS Detection

The DDOS detection module classifies each packet into a DDOS packet or a regular packet. The source and destination IP are extracted and notified if a DDOS packet is detected.

Features: The features are inspired by [git2]; we added the protocol used, source port, destination port, and packet length. The reasons for adding these features are:

  1. Protocol: Protocols like UDP, TCP, and ICMP are used to attack services.

  2. Port: A port signifies which service is getting attacked.

  3. Packet Length: There might be cases where the packets may or may not have any data.

Features used are:

  1. Ethernet Source Occurrence-look back 100

  2. Ethernet Destination Occurrence-look back 100

  3. IP Source Occurrence-look back 100

  4. IP Destination Occurrence-look back 100

  5. Ethernet Source Occurrence-look back 1000

  6. Ethernet Destination Occurrence-look back 1000

  7. IP Source Occurrence-look back 1000

  8. IP Destination Occurrence-look back 1000

  9. Timestamp 1: Current packet time - Previous packet time

  10. Timestamp 2: Current packet time - packet’s time

  11. Timestamp 3: Current packet time - packet’s time

  12. Timestamp 4: Current packet time - packet’s time

  13. Protocol used, i.e., UDP, TCP, or ICMP

  14. Source Port: Source port in packet

  15. Destination Port: Destination port in packet

  16. Length of packet

Occurrence look back ’n’ means occurrence of IP address in last ’n’ packets.
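The look-back occurrence feature can be computed with a sliding window, as in the sketch below. The class name `LookBack` is illustrative, and whether the current packet counts towards its own occurrence is not specified in the text; here it is excluded.

```python
from collections import Counter, deque


class LookBack:
    """Counts how often a value occurred among the previous `n` packets."""

    def __init__(self, n):
        self.n = n
        self.window = deque()
        self.counts = Counter()

    def occurrence(self, value):
        """Return the count of `value` in the last n packets, then record it."""
        seen = self.counts[value]
        self.window.append(value)
        self.counts[value] += 1
        if len(self.window) > self.n:        # drop the packet that left the window
            old = self.window.popleft()
            self.counts[old] -= 1
        return seen


# One tracker per feature, e.g. source IP with look back 100 and 1000.
src_100, src_1000 = LookBack(100), LookBack(1000)
for src_ip in ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    print(src_ip, src_100.occurrence(src_ip), src_1000.occurrence(src_ip))
```

Keeping a running counter next to the window makes each update constant-time, which matters when every packet is classified individually.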

Detection Models: Our DDOS Detection technique provides better accuracy than existing methods[proceedings2020063051]. The highest accuracy in the referenced method is 97.86%, whereas the highest accuracy achieved by our method is 99.94%.

S No.  Classifier     Accuracy  Precision  Recall  F-Score
1      Decision Tree  99.92     99.94      99.85   99.93
2      SVM            99.92     99.92      99.92   99.92
3      LR             99.91     99.91      99.91   99.91

TABLE IX: ML models and Accuracy of DDoS detection model (in %)

V Validation

V-A Experimental Setup

The SOAR Engine was tested in a real network to validate that the system works in practice and not only in theory. An internal network environment was configured to reproduce a real-life setup and attack scenarios. We used a machine with a 32-core processor and 64 GB of RAM hosted on DigitalOcean. Since the SOAR engine mainly focuses on deploying honeypots in internal networks, we took advantage of an ongoing Capture The Flag event hosted by the TalentSprint-IIT Kanpur advanced certification program. The event involved 66 players, who played throughout the event. Each player was given a machine inside the network with a unique username and password. The machines given to the players, each with a unique IP, were Docker images deployed on the host machine. Between the IPs given to players, some IPs were reserved for honeypot deployment by the SOAR Engine. There were also different HTTP, FTP, Modbus, and SMTP servers running on some IPs in the network. Each player can be considered an attacker who has compromised an internal machine and is trying to attack the network. The players were given a problem statement in which they had to find flags hidden in files on some machines in the network. They had to act as attackers, attacking the whole network and finding critical data to solve it. This setup ran for four days, and the results demonstrate that the SOAR Engine works in practice and uses resources efficiently.

V-B Results

V-B1 Number of times honeypots deployed:

One of the essential features of the SOAR Engine is to deploy a honeypot when required and to delete it after it is idle for a long time. This technique helps save resources in the machine where the SOAR Engine is running. The statistics of the Honeypots deployed by the engine are:

  1. Apache – Deployed and deleted thirteen times

  2. SSH – Deployed and deleted ten times

  3. SMTP – Deployed and deleted ten times

  4. Modbus – Deployed and deleted once

  5. XSS, OSC – Deployed and deleted three times

  6. SQLi – Deployed and deleted three times

The Modbus honeypot was deployed only once in four days. Deploying it statically would have wasted resources.

Fig. 4: Number of times Honeypots were deployed
Fig. 5: CPU Usage Time of SOAR Engine vs Statically Deployed Honeypots

V-B2 Number of Attacks on the Honeypots:

In the Capture The Flag event, one of the problem statements required targeting a web server hosting a website. Hence, the web-based honeypots received far more attacks than the others. However, attacks were also detected in the SSH, Modbus, and SMTP honeypots. Another problem statement in the CTF required every player to visit a machine assigned to them for solving problems and finding flags. The players also had to act as attackers and visit other machines in the network to find flags and other important intelligence. During this exercise, in addition to finding flags, if a player visited another player’s machine and attacked it, the attacker was given a positive point, and the victim was given a negative point. If players visited the honeypots and interacted with them, they were given a negative point. Since there were 66 players and each had a web server running on their assigned machine, every web server can be considered a static honeypot. In the SOAR Engine, the HTTP protocol (web-server honeypot) had two different honeypot images, and on each instance of attack, alternate honeypot images were deployed and deleted (when idle) automatically. The 66 static web-server honeypots collected 63108 attacks, whereas the two dynamically deployed web-server honeypot images collected 7555 attacks. Hence, each statically deployed honeypot collected about 956 attacks, whereas each dynamically deployed image collected around 3777 attacks, with the additional benefit of saving resources when there are no attacks.
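The per-honeypot figures follow directly from the totals reported above, as this short calculation shows. The variable names are illustrative.

```python
# Totals reported for the CTF experiment.
static_attacks, static_honeypots = 63108, 66
dynamic_attacks, dynamic_images = 7555, 2

static_avg = static_attacks / static_honeypots
dynamic_avg = dynamic_attacks / dynamic_images

print(f"static:  {static_avg:.1f} attacks per honeypot")
print(f"dynamic: {dynamic_avg:.1f} attacks per image")
print(f"ratio:   {dynamic_avg / static_avg:.2f}x")
```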

Fig. 6: Number of Attacks per honeypot image
Fig. 7: Number of Attacks of each type per honeypot image

V-B3 Comparison of attacks in statically deployed honeypots vs SOAR Engine deployed honeypots:

We also deployed static honeypots (of the same protocols used in the SOAR Engine) on DigitalOcean to compare attacker engagement time with that of our SOAR Engine. The static honeypots were flagged as honeypots within 30 minutes of deployment, and attacker interaction and engagement declined over time. Table X compares the attacker engagement time of our SOAR Engine with that of statically deployed honeypots.

The results show that statically deployed honeypots attracted fewer attackers and used more resources than the honeypots deployed by the SOAR Engine. Hence, it can be established that honeypots deployed by our SOAR Engine increase attacker engagement time and save resources, thereby increasing the efficiency of honeypot deployment. Since these honeypots are deployed only in an attacker’s presence, they can be regarded as behavioural honeypots.

SOAR Engine Deployed Honeypots  | Statically Deployed Honeypots
Attacker IP       Time          | Attacker IP        Time
172.16.238.5      5977 sec      | 20.55.53.144       321 sec
172.16.238.24     4390 sec      | 104.155.181.214    263 sec
172.16.238.58     4348 sec      | 162.158.167.231    165 sec
172.16.238.8      3964 sec      | 106.208.155.125    75 sec
172.16.238.5      3413 sec      | 106.208.154.240    55 sec
172.16.238.30     2069 sec      | 142.93.157.218     50 sec
172.16.238.24     1987 sec      | 162.158.165.53     35 sec
172.16.238.40     1961 sec      | 52.136.124.138     29 sec
172.16.238.5      1871 sec      | 45.146.164.110     17 sec
172.16.238.5      1509 sec      | 45.146.164.110     17 sec
TABLE X: Top 10 Engagement Times in SOAR Engine vs Statically Deployed Honeypots
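As a quick check on Table X, the mean engagement time of each column can be computed from the listed values. The sketch below uses only the ten values per column shown in the table; the resulting means (about 3148.9 s and 102.7 s) appear consistent with the average engagement times of 3148 and 102 seconds quoted in the abstract. The list names are illustrative.

```python
from statistics import mean

# Engagement times in seconds, copied from Table X (top 10 per deployment type).
soar_seconds = [5977, 4390, 4348, 3964, 3413, 2069, 1987, 1961, 1871, 1509]
static_seconds = [321, 263, 165, 75, 55, 50, 35, 29, 17, 17]

soar_mean = mean(soar_seconds)
static_mean = mean(static_seconds)
engagement_ratio = soar_mean / static_mean

print(f"SOAR Engine mean: {soar_mean:.1f} s")
print(f"Static mean:      {static_mean:.1f} s")
print(f"Ratio:            {engagement_ratio:.1f}x")
```

Note that this covers only the top 10 sessions in each setting, so it says nothing about the distribution of shorter sessions.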

V-B4 DDOS, Botnet Attacks, and Malware Collection:

The SOAR Engine captured a total of 965 DDOS packets in the network. No botnet-positive flow was detected in the network during the experiment. The honeypots deployed by the SOAR Engine also captured three malicious samples. The samples were shell files that tried to:

  1. Delete the filesystem using the command rm -rf

  2. View the /etc/passwd file

  3. View SSH logs at /var/log/auth.log

V-B5 Ready-to-response time of the SOAR Engine:

Selection of IP and Services w.r.t Lateral Movement:

Our IP selection algorithm selects the honeypot IP based on the first IP on which the attacker performs recon. While the honeypot is being deployed, the attacker’s scan could in principle pass the IP where the honeypot is to be deployed. The SOAR Engine’s ready-to-response time prevents this. When the engine is deployed on a machine with two cores and 4 GB of RAM, it takes around 6 seconds to deploy a honeypot after detecting recon activity on the first IP. A simple network scan of the first thousand ports of each system takes around 1.5 seconds per system. Since the IPs are allocated at a fixed distance (20 IPs apart in our experiment), the engine has around 30 seconds (20 × 1.5 s) before the attacker’s recon activity reaches the honeypot. As the resources of the system hosting the SOAR Engine increase, the engine’s performance improves accordingly.
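The timing argument above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's actual IP selection algorithm: the helper names `honeypot_ip`, `safety_margin`, and `min_offset` are hypothetical, and the sketch assumes the attacker scans addresses sequentially in ascending order at a constant rate.

```python
import ipaddress
import math

DEPLOY_SECONDS = 6.0         # reported deploy time on 2 cores / 4 GB RAM
SCAN_SECONDS_PER_HOST = 1.5  # reported time to scan 1000 ports on one host
IP_OFFSET = 20               # reported distance between first scanned IP and honeypot IP


def honeypot_ip(first_scanned_ip: str, offset: int = IP_OFFSET) -> str:
    """Hypothetical helper: place the honeypot `offset` addresses ahead of the scan."""
    return str(ipaddress.ip_address(first_scanned_ip) + offset)


def safety_margin(offset: int = IP_OFFSET,
                  scan: float = SCAN_SECONDS_PER_HOST,
                  deploy: float = DEPLOY_SECONDS) -> float:
    """Seconds between the honeypot becoming ready and the scan reaching it."""
    return offset * scan - deploy


def min_offset(scan: float = SCAN_SECONDS_PER_HOST,
               deploy: float = DEPLOY_SECONDS) -> int:
    """Smallest offset for which deployment finishes before the scan arrives."""
    return math.ceil(deploy / scan)


print(honeypot_ip("172.16.238.5"))  # 172.16.238.25
print(safety_margin())              # 24.0
print(min_offset())                 # 4
```

With the reported figures, an offset of 20 IPs leaves about 24 seconds of slack after the 6-second deployment. A faster or parallel scanner would shrink this margin, which is one reason the offset may need tuning in practice.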

VI Conclusion

Attacks on organizations are increasing daily, and the types of attacks are organization-specific. Deception technology has been used for many years to collect threat intelligence about the types of attacks and to divert attackers from the original target. Static honeypots do not engage attackers long enough to reveal their capability and modus operandi. Hence, we proposed a Security Orchestration, Automation, and Response Engine that dynamically deploys honeypots according to the attacker’s behavior, saves resources, and increases the attacker’s engagement with the honeypots. We also implemented a botnet and DDOS detection tool for the honeypot network and a malware storage system. The orchestration uses both rule-based and machine-learning techniques. After deploying the whole system in a live environment, we saw that honeypots deployed by the SOAR Engine provide far better attacker engagement and save around 89% of the CPU time on the host machine. The honeypots collected three malicious samples, and DDOS traffic was also detected. These data and experiments support our claim that the SOAR Engine performs better than statically deployed honeypots and can be used by organizations to protect their internal networks.

VII Future Work

Honeypots are among the best systems for gathering information and threat intelligence. This paper deals with the idea that honeypots, if deployed intelligently, can help protect the network infrastructure from attack and alert the security team to take action early. The IP selection algorithm can be improved based on the incoming network and port scans. Also, insider attackers trying to compromise critical data in the network currently have less chance of being detected. The honeypot bank must be expanded, and high-interaction versions of various honeypots can be developed. Finally, attackers may perform lateral movement, hopping from one IP to another, so a more guided mechanism is needed to ensure better engagement of the attacker.

References