Classifying DNS Servers based on Response Message Matrix using Machine Learning

by Keiichi Shima, et al.
The University of Tokyo

Improperly configured domain name system (DNS) servers are sometimes used as packet reflectors as part of a DoS or DDoS attack. Detecting packets created as a result of this activity is logically possible by monitoring the DNS request and response traffic. Any response that does not have a corresponding request can be considered a reflected message; checking and tracking every DNS packet, however, is a non-trivial operation. In this paper, we propose a detection mechanism for DNS servers used as reflectors by using a DNS server feature matrix built from a small number of packets and a machine learning algorithm. The F1 score of bad DNS server detection was more than 0.9 when the test and training data are generated within the same day, and more than 0.7 for the data not used for the training and testing phase of the same day.




I Introduction

The domain name system (DNS) is one of the most important technologies of the Internet: it converts a domain name into an IP address. Without this service, the Internet would not be deployed as widely as it is now. DNS messages are normally carried in UDP packets and, unlike with TCP, it is easy to forge the source address of a UDP packet. As a result, DNS requests with a fake source address can easily be sent to a DNS server. In theory, any DNS server can answer any domain name resolution request; there are no protocol requirements that limit or filter request messages from client nodes.

When DNS was invented, malicious activity utilizing DNS servers as packet reflectors was not extensive; however, as the Internet grew, attackers started to exploit this open operating policy to send traffic to victim nodes by forging DNS message source addresses. To prevent this activity, recent DNS servers are configured to answer only requests originating from specific client nodes, typically filtered by source IP address. Unfortunately, there are more than a few improperly configured DNS servers in the wild; these are called open resolvers (see the DNS Scanning Project). The DNS protocol is still one of the major vectors for attacks [2][1][4].

In this paper, we propose a method of classifying a DNS server, according to whether or not it is used as a reflector, by monitoring the incoming DNS messages. We collect a series of DNS packets sent from a DNS server and build a feature matrix of the server, assuming that a reflector may have a different packet sequence pattern than a normal DNS server. The preliminary result shows that our method can classify reflectors with an F1 score greater than 0.9 when the test and training data are generated within the same day. The trained model can also classify the data not used for the training and testing phase of the same day with an F1 score of more than 0.7.

II DNS Server Feature Matrix

The basic idea behind this proposal originates from [5]. The method in [5] was designed to detect malicious nodes by investigating a series of TCP SYN packets sent from these nodes. TCP SYN packets are collected per source IP address of the TCP streams, and a feature matrix is generated from them as an image. In the aforementioned study, it was assumed that the images have different shapes depending on the activity of a malicious host, for example, scanning or DoS. The images generated from SYN packets were used as training data for a deep learning network based on a CNN algorithm.

We follow a similar process in our proposal. The difference is that we use DNS response packets received from servers as an input for building the feature matrix.

To apply our method, we first create training data. To split the DNS messages into good messages and suspicious messages, we used the mechanism proposed in [3]. We monitor DNS messages at the boundary of an organization’s network and check all request and response messages. If there is a DNS server being used as a reflector, and it is sending unintended response messages, we will not see any matching request messages sent from within the organization.
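The labeling step above can be sketched as follows. This is a minimal illustration, not the procedure of [3] itself: the `DnsMessage` record, the `label_responses` function, and the choice of matching key (client address, client port, server address, and DNS transaction ID) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsMessage:
    """Hypothetical record for one DNS message seen at the network boundary."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    txid: int          # DNS transaction ID from the message header
    is_response: bool  # QR bit: False for a request, True for a response


def label_responses(messages):
    """Label each response 'good' if a matching request was seen, else 'suspicious'.

    `messages` must be ordered by capture time. The matching key is an
    assumption of this sketch.
    """
    pending = set()  # requests that have not been answered yet
    labeled = []
    for msg in messages:
        if not msg.is_response:
            pending.add((msg.src_ip, msg.src_port, msg.dst_ip, msg.txid))
            continue
        # A response travels in the opposite direction of its request.
        key = (msg.dst_ip, msg.dst_port, msg.src_ip, msg.txid)
        if key in pending:
            pending.discard(key)  # each request accounts for one response
            labeled.append((msg, "good"))
        else:
            labeled.append((msg, "suspicious"))
    return labeled
```

A production monitor would also need to expire old entries in `pending`; that is left out here to keep the sketch short.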

The values we used to generate a feature matrix are shown in TABLE I.

Type           Description
Timestamp      Timestamp of a packet
Port #         Source port # of a packet
Size           Size of a DNS message
OPCODE field   Indicating the DNS message type
AA field       Indicating Authoritative Answer or not
TC field       Indicating if a packet is truncated
RD field       Indicating if recursion is desired
RA field       Indicating if recursion is available
Z field        Reserved field and should be 0
RCODE field    Indicating result code
QDCOUNT        # of query items
ANCOUNT        # of answer records
NSCOUNT        # of name server records
ARCOUNT        # of additional records

TABLE I: Values of DNS message used to build a feature matrix

The captured messages are grouped by source IP address (in this case the DNS server IP address), sorted by timestamp, and divided into groups of 100 packets. Fig. 1 shows an example of a DNS server feature matrix.

Fig. 1: Example of a visualized DNS server feature matrix

The order of rows is the same as the order presented in TABLE I. The values are normalized per row. Each column represents one DNS response message. Because a feature matrix is created for every 100 packets, each matrix has 100 columns.
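The construction described above (group by server address, sort by timestamp, cut into groups of 100, normalize per row) can be sketched as below. Several details are assumptions of this example rather than statements from the paper: the dictionary keys (which follow the DNS header field names), min-max scaling as the per-row normalization, mapping constant rows to zero, and dropping a trailing group of fewer than 100 packets.

```python
from collections import defaultdict

import numpy as np

# Row order follows TABLE I; the key names are hypothetical.
FEATURES = (
    "timestamp", "src_port", "size", "opcode", "aa", "tc", "rd",
    "ra", "z", "rcode", "qdcount", "ancount", "nscount", "arcount",
)
PACKETS_PER_MATRIX = 100


def normalize_rows(matrix):
    """Scale each row to [0, 1] (assumed min-max); constant rows become 0."""
    low = matrix.min(axis=1, keepdims=True)
    span = matrix.max(axis=1, keepdims=True) - low
    return (matrix - low) / np.where(span == 0, 1.0, span)


def build_feature_matrices(responses):
    """Return {server_ip: [14 x 100 arrays]} from an iterable of packet dicts."""
    by_server = defaultdict(list)
    for pkt in responses:
        by_server[pkt["src_ip"]].append(pkt)

    matrices = {}
    for server_ip, pkts in by_server.items():
        pkts = sorted(pkts, key=lambda p: p["timestamp"])
        full = len(pkts) - len(pkts) % PACKETS_PER_MATRIX  # drop the remainder
        matrices[server_ip] = [
            normalize_rows(np.array(
                [[p[name] for p in pkts[start:start + PACKETS_PER_MATRIX]]
                 for name in FEATURES],
                dtype=float,
            ))
            for start in range(0, full, PACKETS_PER_MATRIX)
        ]
    return matrices
```

Each resulting array has one row per feature and one column per response message, so it can be rendered directly as a grayscale image like the ones in the figures.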

III Learning with SVM

The feature matrix image shown in Fig. 1 is based on messages sent from a suspicious DNS server. This particular server kept sending unsolicited DNS response messages; we can guess the behavior by observing the image. A smoothly changing timestamp row means that messages are being sent periodically. Most packets have the same shape except for source port number. Rows that are almost white or black signify, in most cases, the same values.

Fig. 2 shows a feature matrix of a good DNS server.

Fig. 2: Example of a feature matrix of a good DNS server

Unlike the case shown in Fig. 1, the fields indicating the number of resource records (such as ARCOUNT) in each response packet take several different values. This is plausible because the contents of DNS request messages sent to a specific DNS server vary by client, and the responses vary with the requests.

Each dataset used with SVM is one day of traffic from a research network; the two datasets were captured on 24th August 2019 and 25th August 2019. The sizes of the datasets are listed in TABLE II.

Date        # of Good / Bad DNS pkts    # of Good / Bad matrices
24th Aug.   33,824,531 / 2,863,321      323,269 / 28,291
25th Aug.   30,238,481 / 1,148,935      291,730 / 6,105

TABLE II: Packet and Matrix counts of datasets

The hyper-parameters, selected using grid search, were penalty = 10, gamma = 0.01, and kernel = rbf. For each day, the model was trained and tested with 20,000 randomly selected good matrices and 80% of the bad matrices. For example, for the dataset of the 24th, we randomly sampled 20,000 of the 323,269 good matrices and 80% of the 28,291 bad matrices. The sampled data were split into training data and test data at a ratio of 0.8 to 0.2.
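The training setup above might look like the following sketch. It assumes scikit-learn (where the penalty corresponds to the `C` parameter of `SVC`) and assumes that each 14 x 100 matrix is flattened into a single feature vector; neither detail is stated in the paper. The data here are synthetic placeholders produced by the hypothetical `synthetic_matrix` helper, not the captured traffic.

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

N_FEATURES, N_PACKETS = 14, 100


def synthetic_matrix(rng, bad):
    """Placeholder matrix: 'bad' servers repeat the same header values."""
    m = rng.random((N_FEATURES, N_PACKETS))
    m[0] = np.sort(m[0])  # timestamps increase within a matrix
    if bad:
        # Every row except timestamp and source port is constant.
        m[2:] = rng.random((N_FEATURES - 2, 1))
    return m


def make_dataset(n_per_class, seed=0):
    rng = np.random.default_rng(seed)
    mats = [synthetic_matrix(rng, bad=False) for _ in range(n_per_class)]
    mats += [synthetic_matrix(rng, bad=True) for _ in range(n_per_class)]
    X = np.stack(mats).reshape(2 * n_per_class, -1)  # flatten each matrix
    y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = good, 1 = bad
    return X, y


def train_and_evaluate(X, y):
    """Split 0.8 / 0.2 and fit an RBF SVM with the reported hyper-parameters."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0, stratify=y
    )
    clf = SVC(C=10, gamma=0.01, kernel="rbf")
    clf.fit(X_train, y_train)
    return clf, y_test, clf.predict(X_test)


X, y = make_dataset(n_per_class=200)
clf, y_test, pred = train_and_evaluate(X, y)
print(classification_report(y_test, pred, target_names=["Good", "Bad"]))
```

The scores printed for this synthetic data say nothing about real traffic; the sketch only shows the shape of the pipeline.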

TABLE III and TABLE IV present the classification results on the sampled data for each day. On the sampled data, the classification accuracy is high: every F1-score is 0.99 or above.

                Precision   Recall   F1-score   Support
Good            1.00        1.00     1.00       3,987
Bad             1.00        1.00     1.00       4,540
Accuracy                             1.00       8,527
Macro Avg.      1.00        1.00     1.00       8,527
Weighted Avg.   1.00        1.00     1.00       8,527

TABLE III: Classification result of sample data of 24th Aug.

                Precision   Recall   F1-score   Support
Good            1.00        1.00     1.00       3,993
Bad             0.98        1.00     0.99       984
Accuracy                             1.00       4,977
Macro Avg.      0.99        1.00     0.99       4,977
Weighted Avg.   1.00        1.00     1.00       4,977

TABLE IV: Classification result of sample data of 25th Aug.

Next, we evaluated the rest of the data in the datasets not used in the training phase for each day. The results are shown in TABLE V and VI.

                Precision   Recall   F1-score   Support
Good            1.00        1.00     1.00       303,269
Bad             0.85        1.00     0.92       5,659
Accuracy                             1.00       308,928
Macro Avg.      0.92        1.00     0.96       308,928
Weighted Avg.   1.00        1.00     1.00       308,928

TABLE V: Classification result of unused data of 24th Aug.

                Precision   Recall   F1-score   Support
Good            1.00        1.00     1.00       271,730
Bad             0.54        1.00     0.70       1,221
Accuracy                             1.00       272,951
Macro Avg.      0.77        1.00     0.85       272,951
Weighted Avg.   1.00        1.00     1.00       272,951

TABLE VI: Classification result of unused data of 25th Aug.

Fig. 3: Example of a feature matrix of an uncertain server

The precision for bad matrices decreases on both days (0.85 on the 24th and 0.54 on the 25th), which means that more false positives appear. The F1-score on the 24th (0.92) is still acceptable; however, the score on the 25th (0.70) is considerably degraded. Since the recall values of both good matrices and bad matrices remain high, we can still detect bad matrices with high enough probability.
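As a quick check of how the reported scores fit together, the F1-score is the harmonic mean of precision and recall, so the F1 values for the bad class follow directly from the precision and recall columns. The sketch below recomputes them; the `estimated_false_positives` helper is an addition for illustration and yields only a rough figure, because it works from values already rounded to two decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def estimated_false_positives(support, precision, recall):
    """Rough false-positive count implied by rounded precision and recall."""
    true_positives = support * recall
    return true_positives * (1 / precision - 1)


# (label, precision, recall, support) for the "Bad" class in the tables.
bad_rows = [
    ("24th Aug., unused data", 0.85, 1.00, 5659),
    ("25th Aug., unused data", 0.54, 1.00, 1221),
]
for label, precision, recall, support in bad_rows:
    f1 = f1_score(precision, recall)
    fp = estimated_false_positives(support, precision, recall)
    print(f"{label}: F1 = {f1:.2f}, estimated false positives = {fp:.0f}")
```

The recomputed F1 values are 0.92 and 0.70, matching the tables. Under the rounding caveat above, the estimated number of false positives is on the order of 1,000 on both days; the precision drops more on the 25th mainly because far fewer truly bad matrices remain in that day's unused data.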

Fig. 3 shows a feature matrix of a DNS server that was labeled as bad because we did not see any request messages matching the response messages sent from the server. The shape looks quite similar to that of the good feature matrix shown in Fig. 2, in the sense that the contents of the response messages have a wide variety of patterns. The server shown in Fig. 3 was one of the DNS servers of the host organization of the datasets, where we captured the packets. Considering the quality of the security operators of the organization, it is unlikely that the server was used as a reflector. Our guess is that the request messages reached the server over a path different from the one we were monitoring.

Cleansing the source data is one of the most important phases in achieving reliable results with machine learning techniques and, at the same time, one of the hardest tasks, especially when the data is large and its contents change dynamically. Since the Internet is an open system and traffic trends change every day, assigning correct labels to a training dataset is not an easy task. In these preliminary experiments, we did not perform intensive data cleansing because of a lack of time. For example, the matrix pattern shown in Fig. 3 may be a benign pattern. We will continue to investigate the contents of the dataset in more detail to obtain better labels.

IV Conclusion

We attempted to classify DNS servers according to whether or not they were being used as reflectors by capturing a small number of DNS response messages sent from them. We used a method similar to the one proposed in [5] to build a DNS server feature matrix. The preliminary results of classification using SVM show sufficient precision as long as the training and test data are sampled from the same day. At this moment, the trained model does not classify as accurately when applied to the rest of the data, which was not used for training and testing. One possible reason is improper labeling of the data. As we described, we labeled each matrix based on the technique described in [3]. The method can find all the unsolicited DNS response messages, assuming we can monitor the entire DNS message exchange. In our preliminary experiments, we saw unsolicited DNS response messages sent from servers located inside the host organization, which may be benign servers. Assigning correct labels to data is important when using the data as a training dataset for machine learning algorithms. We plan to investigate the contents in more detail to create better training datasets.

The classification method we used in this paper was SVM. SVM is a simple and easy-to-use tool for data analysis; however, more advanced algorithms have recently become available. Therefore, in the future, we plan to make the results more stable by investigating data and matrix generation approaches (e.g., what values to use to build a matrix) and by investigating other classification algorithms (including deep learning technologies) to achieve superior performance.


This work was supported by JST CREST Grant Number JPMJCR1783, Japan.


  • [1] K. Alieyan, M. M. Kadhum, M. Anbar, S. U. Rehman, and N. K. A. Alajmi (2016) An overview of DDoS attacks based on DNS. In 2016 International Conference on Information and Communication Technology Convergence (ICTC 2016), pp. 276–280.
  • [2] Z. Durumeric, M. Bailey, and J. A. Halderman (2014) An Internet-wide view of Internet-wide scanning. In 23rd USENIX Security Symposium (USENIX Security 14), San Diego, CA, pp. 65–78. ISBN 978-1-931971-15-7.
  • [3] G. Kambourakis, T. Moschos, D. Geneiatakis, and S. Gritzalis (2007) Detecting DNS amplification attacks. In International Workshop on Critical Information Infrastructures Security, pp. 185–196.
  • [4] M. Kührer, T. Hupperich, J. Bushart, C. Rossow, and T. Holz (2015) Going Wild: Large-Scale Classification of Open DNS Resolvers. In Proceedings of the 2015 Internet Measurement Conference (IMC'15), pp. 355–368. ISBN 9781450338486.
  • [5] R. Nakamura, Y. Sekiya, D. Miyamoto, K. Okada, and T. Ishihara (2018) Malicious host detection by imaging SYN packets and a neural network. In Proceedings of IEEE International Symposium on Networks, Computers and Communications (ISNCC2018).