Method and Tools for Identifying Botnets That Use the Domain-Flux Technique – Herald of Khmelnytskyi National University

METHOD AND TOOLS FOR IDENTIFYING BOTNETS THAT USE THE DOMAIN-FLUX TECHNIQUE

METHOD AND SOFTWARE FOR DOMAIN-FLUX BOTNET IDENTIFICATION

Pages: 34-42. Issue: No. 3, 2020 (285)
Authors:
S. M. LYSENKO, V. I. KOMAROV
Khmelnytskyi National University
DOI: https://www.doi.org/10.31891/2307-5732-2020-285-3-6
Peer review: 04.05.2020
Printed: 01.06.2020

Abstract (translated from the Ukrainian original)

The paper presents a method for identifying botnets that use the domain-flux technique. The method detects both known and new, previously unknown threats through a comprehensive analysis of DNS traffic. It combines the processing of DNS query failures, frequency-based lexical analysis of domain names, and the analysis of a set of features extracted from DNS messages using the Random Forest machine learning algorithm. This improves the efficiency and reliability of detecting this type of botnet and makes it possible to detect attacks at early stages or even before they occur. The proposed method can serve as a basis for building software for systems that detect domain-flux botnets.
Keywords: botnet, domain flux, malware, Random Forest, DNS.
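
As an illustration of what frequency-based lexical analysis of domain names can look like, the following is a minimal Python sketch, not the authors' implementation. It assumes that a character-frequency profile is built from a sample of legitimate names and that each queried name is scored against it. The function names, the chosen features (length, Shannon entropy, digit ratio, mean character frequency), and the naive extraction of the second-level label are all assumptions made for this example.

```python
import math
from collections import Counter


def registered_label(domain: str) -> str:
    """Return the second-level label, e.g. 'example' for 'www.example.com'.

    Assumption: a naive split on dots; multi-part suffixes such as
    'co.uk' are not handled in this sketch.
    """
    parts = domain.lower().rstrip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_frequencies(labels: list[str]) -> dict[str, float]:
    """Relative character frequencies over a sample of legitimate labels."""
    counts = Counter("".join(labels))
    total = sum(counts.values())
    return {ch: c / total for ch, c in counts.items()}


def lexical_features(domain: str, profile: dict[str, float]) -> dict[str, float]:
    """Hypothetical lexical feature vector for a single domain name."""
    label = registered_label(domain)
    n = len(label)
    return {
        "length": n,
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / n if n else 0.0,
        # Mean reference frequency of the label's characters; characters
        # never seen in the legitimate sample contribute zero.
        "char_score": sum(profile.get(ch, 0.0) for ch in label) / n if n else 0.0,
    }
```

With a profile built from legitimate names, a pseudo-random label typically receives a lower `char_score` than a human-chosen one. Feature vectors of this kind could then be passed to a classifier such as Random Forest.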

Extended abstract in English

The purpose of this paper is to develop a method for detecting domain-flux botnets. We focus on detecting such botnets from Domain Name System (DNS) traffic features. We explored the peculiarities of domain-flux botnets and developed a DNS-based botnet model, a DNS traffic model, and a model of the detection process, which together define the features used. The method passively captures all DNS traffic from the network and extracts the useful data from each DNS message. It combines the handling of DNS query failures, frequency-based lexical analysis of domain names, and the analysis of multiple features derived from DNS messages using the Random Forest machine learning algorithm. We analyzed a large number of legitimate domains and pseudo-random domain names generated by different domain-flux botnets to obtain expected values for domain names generated by humans and by bots. In addition, the method uses a whitelist database to filter out queries for known domain names. The method identifies both known and new, previously unknown threats. This comprehensive analysis of DNS traffic improves the efficiency and reliability of detecting this type of botnet and allows attacks to be detected in their early stages or even before they occur. To evaluate the effectiveness of the proposed approach, the Random Forest algorithm was applied to train a predictive model for our detection system. The proposed scheme was implemented and tested in a real local area network. The experimental results show that the method achieves high detection efficiency, with an average overall true positive rate of up to 96.08% and a false positive rate of 0.8%. The proposed method can also serve as the basis for building other software systems for detecting domain-flux botnets.
Keywords: botnet, domain-flux, malware, Random Forest, DNS.
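
The whitelist filtering and DNS query failure handling described in the abstract might be sketched as follows. This is a hedged illustration only: the `DnsRecord` type, the function names, and the thresholds `min_failures` and `min_ratio` are hypothetical and do not come from the paper. The only fixed fact used is that response code 3 in a DNS message means "Name Error" (NXDOMAIN), as defined in RFC 1035.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple

NXDOMAIN = 3  # RCODE 3, "Name Error", per RFC 1035


class DnsRecord(NamedTuple):
    """Hypothetical summary of one captured DNS response."""
    client: str  # IP address of the querying host
    qname: str   # queried domain name
    rcode: int   # DNS response code


def is_whitelisted(qname: str, whitelist: set[str]) -> bool:
    """True if the name or any of its parent domains is on the whitelist."""
    labels = qname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


def failure_stats(records: Iterable[DnsRecord],
                  whitelist: set[str]) -> dict[str, tuple[int, int]]:
    """Per client: (failed queries, total queries), whitelisted names skipped."""
    totals: dict[str, int] = defaultdict(int)
    failures: dict[str, int] = defaultdict(int)
    for rec in records:
        if is_whitelisted(rec.qname, whitelist):
            continue
        totals[rec.client] += 1
        if rec.rcode == NXDOMAIN:
            failures[rec.client] += 1
    return {client: (failures[client], totals[client]) for client in totals}


def suspicious_hosts(records: Iterable[DnsRecord], whitelist: set[str],
                     min_failures: int = 3, min_ratio: float = 0.5) -> list[str]:
    """Clients whose failure count and failure ratio reach the
    illustrative (assumed) thresholds."""
    stats = failure_stats(records, whitelist)
    return sorted(
        client for client, (failed, total) in stats.items()
        if failed >= min_failures and failed / total >= min_ratio
    )
```

A host that repeatedly queries non-existent names is only a candidate at this stage. In the scheme the abstract describes, the lexical analysis and the Random Forest classification are the other components of the detection.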

