Background: The Domain Name System (DNS) is often described as the phone book of the Internet.
Its main purpose is to translate domain names into IP addresses that computers can use. However,
DNS can be vulnerable to various kinds of attacks, such as DNS poisoning attacks and DNS
Objective: The main objective of this paper is to enable researchers to identify DNS tunnel traffic using
machine-learning algorithms. Training machine-learning algorithms to detect DNS tunnel traffic
and determine which protocol was used will help the community speed up the process of detecting
Methods: In this paper, we consider the DNS tunneling attack and discuss how attackers
can exploit the DNS protocol to exfiltrate data from a network. The attack starts by encoding
data inside DNS queries that are sent outside the network. The malicious DNS server receives
the data in small chunks, decodes each payload, and reassembles the chunks on the server. The main concern is
that DNS is a fundamental service that is not usually blocked by firewalls and receives less attention
from system administrators because of its vast volume of traffic.
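The chunk-and-reassemble mechanism described above can be illustrated with a minimal, offline sketch. It is not Iodine's wire format: the function names `encode_chunks` and `decode_chunks`, the sequence-number label, the base32 encoding, and the domain `tunnel.example` are all assumptions chosen for illustration. No network traffic is generated; the code only builds and parses query names.

```python
import base64

MAX_LABEL = 63  # DNS limits each label to 63 bytes (RFC 1035)


def encode_chunks(payload: bytes, domain: str, chunk_size: int = 30) -> list[str]:
    """Split payload into chunks and embed each one as a base32 label.

    Hypothetical layout: <sequence>.<base32 chunk>.<domain>
    """
    names = []
    for seq, start in enumerate(range(0, len(payload), chunk_size)):
        chunk = payload[start:start + chunk_size]
        # Base32 uses only letters and digits, so it is safe in a DNS label.
        label = base64.b32encode(chunk).decode("ascii").rstrip("=").lower()
        if len(label) > MAX_LABEL:
            raise ValueError("chunk_size too large for a single DNS label")
        names.append(f"{seq}.{label}.{domain}")
    return names


def decode_chunks(names: list[str]) -> bytes:
    """Server side: decode each label and reassemble chunks in sequence order."""
    parts = {}
    for name in names:
        seq, label = name.split(".")[:2]
        # Restore the '=' padding that was stripped before sending.
        padded = label.upper() + "=" * (-len(label) % 8)
        parts[int(seq)] = base64.b32decode(padded)
    return b"".join(parts[i] for i in sorted(parts))


if __name__ == "__main__":
    secret = b"example payload that does not fit in one label " * 2
    queries = encode_chunks(secret, "tunnel.example")
    for q in queries:
        print(q)
    # Queries may arrive out of order; the sequence label restores the order.
    assert decode_chunks(list(reversed(queries))) == secret
```

The sequence number is carried in its own label because DNS queries can arrive out of order; a real tool would also need to handle retransmission and the 253-character limit on the full name.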
Results: This paper investigates how this type of attack happens using a DNS tunneling tool, by setting
up an environment consisting of compromised DNS servers and compromised hosts with the Iodine
tool installed on both. The generated dataset contains HTTP, HTTPS,
SSH, SFTP, and POP3 traffic tunneled over DNS. No features were removed from the dataset, so researchers
can use all of the features it contains.
Conclusion: DNS tunneling remains a critical attack that needs more attention. A DNS-tunneled
environment allows us to understand how such an attack happens. We built the dataset
by simulating various attack scenarios using different protocols. The dataset contains
PCAP, JSON, and CSV files so that researchers can apply different methods to detect tunnel traffic.
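As an illustration of the kind of input a machine-learning detector might use, the sketch below derives a few simple features from a DNS query name. These features (name length, label count, longest-label length and Shannon entropy) are assumptions commonly used for tunnel detection, not the feature set of the dataset described here, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def query_features(qname: str) -> dict:
    """Compute illustrative per-query features from a DNS query name."""
    name = qname.rstrip(".")  # drop the trailing root dot, if present
    labels = name.split(".")
    longest = max(labels, key=len)
    return {
        "name_length": len(name),
        "label_count": len(labels),
        "longest_label_length": len(longest),
        # Encoded payloads tend to look more random than ordinary hostnames.
        "longest_label_entropy": shannon_entropy(longest),
    }


if __name__ == "__main__":
    for qname in ("www.example.com.", "0.onswg4tfoqqhaylzn5xxezlt.tunnel.example"):
        print(qname, query_features(qname))
```

Feature dictionaries of this form could be collected into a table and passed to any standard classifier; which features actually separate tunneled from benign traffic would need to be checked against the dataset itself.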