Can botnet hunting systems work better without our advice?

Botnet behaviour analysis: How would a data analytics-based system with minimum a priori information perform?

Botnets are an evolving problem. A botnet is a network of computers that have been infected with malware allowing them to be controlled without their owners’ permission. The systems and methods that botnets use to operate change continuously, making them hard to detect. However, for practical purposes they must have an automated updating process, and this provides a potential way of detecting botnets through their communications. Innovation in botnet operation makes it difficult to identify new botnets based on knowledge of past botnets. Developing long-term solutions requires an understanding of what makes a botnet detector effective.

Haddadi and Zincir-Heywood compared five botnet detection systems. The systems chosen used different methods to detect malware: they used regularly updated sets of rules, analysed message content, or analysed content extracted from the flow of communications traffic. Four of the detection methods were based on knowledge of existing malware.

The researchers devised comparative tests for five systems: Snort, BotHunter, Tranalyzer-2, FlowAF and a packet payload-based system.

Snort

Intrusion detection and prevention system that matches data packets to predefined signatures (rule sets) based on a priori knowledge. Publicly available.
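The idea of matching packets against predefined signatures can be illustrated with a toy sketch. This is not Snort's rule language or its matching engine; the Signature and Packet types, the match_signatures function and the two example signatures are all hypothetical and exist only to show the principle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """A hypothetical, simplified signature: protocol, port and a byte pattern."""
    name: str
    protocol: str
    dst_port: int
    pattern: bytes


@dataclass(frozen=True)
class Packet:
    """A hypothetical, simplified packet record."""
    protocol: str
    dst_port: int
    payload: bytes


def match_signatures(packet, signatures):
    """Return the names of all signatures that the packet matches."""
    return [
        sig.name
        for sig in signatures
        if sig.protocol == packet.protocol
        and sig.dst_port == packet.dst_port
        and sig.pattern in packet.payload
    ]


# Invented example signatures, not taken from any real rule set.
SIGNATURES = [
    Signature("example-irc-join", "tcp", 6667, b"JOIN #"),
    Signature("example-http-beacon", "tcp", 80, b"/gate.php"),
]

suspicious = Packet("tcp", 6667, b"JOIN #channel\r\n")
print(match_signatures(suspicious, SIGNATURES))
```

A detector built this way can only flag traffic that someone has already described in a signature, which is the limitation the comparison below is concerned with.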

BotHunter

Based on the idea that all botnet infection processes are similar, it looks for packets related to specific bot actions at different stages of a bot's life cycle in order to better detect infected machines. Publicly available.

FlowAF

Uses machine learning algorithms to classify traffic as originating from botnets based on the time between packets, or the flow intervals.

Tranalyzer-2

Uses a machine learning algorithm to identify malware based on the features of the flow of packets, rather than analyzing individual packets. The system itself chooses the features that provide the most insight, rather than using ones selected by an expert.

Packet payload system

Attempts to classify packets as matching known types of malware based on their features, such as the port, protocol, or size of the packets.

Snort, BotHunter, the packet payload-based system, and FlowAF are systems that use expert knowledge of malware to define the rules and features for detection. The Tranalyzer-2 flow-based system used a minimum of established knowledge to extract a wide range of defining features for malware. The researchers tested the five systems by applying them to 25 publicly available malware collections. The results suggested that the Tranalyzer-2 flow-based and FlowAF systems outperformed the other systems.

Using specific, predefined features to detect malware could limit how well a system works when faced with new threats. It makes sense that a detection system should be able to change how it detects malware to suit what is happening in its environment. Classification methods that can detect unusual traffic behaviour without having to rely on predefined knowledge would be better at adapting over time. It seems that the most useful features for malware detection are those related to the communications traffic flow. More particularly, the time between the arrival of data packets appears to be important. This implies that the flow of traffic in botnets, even decentralized botnets, is different enough from normal user behaviour to be detectable.
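To make the notion of flow timing concrete, the following is a minimal sketch of how packet inter-arrival statistics might be computed per flow. It is not the feature extraction used by Tranalyzer-2 or FlowAF; the function name, the packet tuple layout and the chosen statistics are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flow_interarrival_features(packets):
    """Group packets into flows and summarise the gaps between arrivals.

    packets: iterable of tuples
        (timestamp_seconds, src, dst, src_port, dst_port, protocol)
    This tuple layout is an assumption for the sketch.
    """
    # Collect arrival times per flow, keyed by the usual 5-tuple.
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto in packets:
        flows[(src, dst, sport, dport, proto)].append(ts)

    features = {}
    for key, stamps in flows.items():
        stamps.sort()
        # Gaps between consecutive packet arrivals within the flow.
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        features[key] = {
            "packets": len(stamps),
            "duration": stamps[-1] - stamps[0],
            "mean_gap": mean(gaps) if gaps else 0.0,
            "std_gap": pstdev(gaps) if gaps else 0.0,
        }
    return features


# Invented example traffic: one flow with evenly spaced packets,
# and one flow containing a single packet.
example = [
    (0.0, "10.0.0.5", "203.0.113.9", 40000, 80, "tcp"),
    (10.0, "10.0.0.5", "203.0.113.9", 40000, 80, "tcp"),
    (20.0, "10.0.0.5", "203.0.113.9", 40000, 80, "tcp"),
    (3.2, "10.0.0.7", "198.51.100.4", 40001, 443, "tcp"),
]
for flow, stats in flow_interarrival_features(example).items():
    print(flow, stats)
```

Statistics such as these could be passed to a classifier as flow features. Which features actually separate botnet traffic from normal traffic is something the system would have to learn from data, in line with the findings described above.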

Relying on predefined features of botnets or malware places detection software at a disadvantage. Malware changes and evolves constantly, meaning that past experience has limited benefit in new scenarios. Detection that augments predefined knowledge without relying on it can help malware detection software to identify unknown botnets.

Malware detection systems augmented with detection that is based on discovery rather than expert opinion could be more effective in detecting unknown botnets.