A drive-by download is an attack that automatically downloads malware to a user’s computer without the user’s knowledge or consent. Such attacks work by exploiting vulnerabilities in web browsers and their plugins. The damage may include data leakage leading to financial loss. Traditional antivirus and intrusion detection systems are not effective against these attacks. Researchers have proposed many detection approaches, most of them based on passive blacklisting; the few dynamic classification techniques that have been proposed suffer from clear shortcomings. In this paper, we propose a novel approach that detects web pages infected with drive-by downloads using features extracted from their source code. We test 23 different machine learning classifiers on a data set of 5435 webpages and, based on detection accuracy, select the top five to build our detection model. The approach is expected to serve as a base for implementing and developing anti drive-by download programs. We also develop a graphical user interface program that allows the end user to examine a URL before visiting the website. The Bagged Trees classifier exhibited the highest accuracy, 90.1%, with a true positive rate of 96.24% and a false positive rate of 26.07%.
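The selection step described in the abstract (rank candidate classifiers by detection accuracy, keep the top five, and report true and false positive rates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Confusion` type, the helper function names, the placeholder classifier names, and every count below are hypothetical and chosen only for illustration. The sketch also assumes each evaluation set contains both infected and benign pages, so no denominator is zero.

```python
from typing import Dict, List, NamedTuple


class Confusion(NamedTuple):
    """Confusion-matrix counts for one classifier (hypothetical container)."""
    tp: int  # infected pages flagged as infected
    fp: int  # benign pages flagged as infected
    tn: int  # benign pages passed as benign
    fn: int  # infected pages passed as benign


def accuracy(c: Confusion) -> float:
    """Fraction of all pages classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def true_positive_rate(c: Confusion) -> float:
    """Fraction of infected pages that were detected."""
    return c.tp / (c.tp + c.fn)


def false_positive_rate(c: Confusion) -> float:
    """Fraction of benign pages that were wrongly flagged."""
    return c.fp / (c.fp + c.tn)


def top_k_by_accuracy(results: Dict[str, Confusion], k: int = 5) -> List[str]:
    """Return the names of the k classifiers with the highest accuracy."""
    ranked = sorted(results, key=lambda name: accuracy(results[name]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Illustrative counts only; these are NOT results from the paper.
    results = {
        "Bagged Trees": Confusion(tp=70, fp=7, tn=20, fn=3),
        "classifier_b": Confusion(tp=65, fp=10, tn=17, fn=8),
        "classifier_c": Confusion(tp=60, fp=5, tn=22, fn=13),
    }
    for name in top_k_by_accuracy(results, k=2):
        c = results[name]
        print(name, accuracy(c), true_positive_rate(c), false_positive_rate(c))
```

Ranking on accuracy alone, as the abstract describes, can hide a high false positive rate when infected pages dominate the data set, which is why the two rates are computed separately here.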
Musaab Hasan, Zayed Balbahaith, Monther Aldwairi
Cite This Paper
Aldwairi, M., Hasan, M., & Balbahaith, Z. (2017). Detection of drive-by download attacks using machine learning approach. International Journal of Information Security and Privacy (IJISP), 11(4), 16-28.