Introduction
In the last few years there has been a tremendous increase in connectivity between systems, which has brought about limitless possibilities and opportunities. Unfortunately, security-related problems have increased at the same rate. Computer systems are becoming increasingly vulnerable to attacks. These attacks or intrusions, based on flaws in the operating system or application programs, usually read or modify confidential information or render the system useless. Formally, an intrusion is defined as any activity that violates the confidentiality, integrity or availability of the system.
Intrusion prevention is more desirable, but it cannot be fully achieved due to several reasons like unknown bugs in software, vast base of installed systems, abuse by insiders and human negligence. Many times it is difficult to have good access control while simultaneously making the system user friendly. Attacks are inevitable, but even after the attack has occurred, it is important to determine that the attack has happened, assess the extent of damage and track down the attacker. This helps in preventing future attacks. Due to these reasons, a detection system as a second line of defence is always desirable.
Intrusion detection systems (IDS) can be classified in two ways. The first one is based on the source of data being analyzed by the system. If the data is from operating system logs and application logs, it is called a 'host based' detection system; if the data is from network traffic, it is called a 'network based' detection system. Each method has its own advantages and disadvantages. For example, an attack by a local user cannot be detected by a network based system, but a denial of service attack can be detected more efficiently by a network based system. Thus each method is more efficient in detecting a particular class of attacks than the other.
The other classification is based on the detection method used, irrespective of the source of data. The main types in this classification are misuse detection systems and anomaly detection systems. In misuse detection, well-known intrusions are represented by signatures. Each signature is a pattern of activity which corresponds to the intrusion it represents. A detection system using such signatures is called a 'signature based' or a 'misuse detection' system. These detection systems search for patterns of intrusions in the data being analyzed. Thus misuse detection is basically a pattern matching process. Misuse detection systems are accurate and have a low false alarm rate, but they cannot detect unknown intrusions.
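As a toy illustration of this pattern matching process (the signatures and payload below are hypothetical examples, not real Snort rules), misuse detection over packet payloads can be sketched as:

```python
# Minimal sketch of signature-based (misuse) detection: each signature is a
# byte pattern, and detection is substring matching over packet payloads.

SIGNATURES = {
    "shellcode-nop-sled": b"\x90" * 8,          # run of x86 NOP bytes
    "php-remote-include": b"include(http://",   # remote file inclusion attempt
}

def match_signatures(payload: bytes) -> list:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]

# A payload containing a remote-include attempt triggers exactly one signature.
alerts = match_signatures(b"GET /index.php?page=include(http://evil/x HTTP/1.0")
```

A real signature engine would use a multi-pattern algorithm (e.g. Aho-Corasick) rather than one scan per signature, but the accuracy/unknown-attack trade-off described above is the same.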
Anomaly detection systems assume that intrusions are anomalies or deviations from normal system activity. These detection systems try to capture the normal behaviour of the system (also called the normal profile), and then detect deviations from this normal behaviour. If this deviation is greater than a threshold, an alert is raised. Anomaly detection systems can detect unknown intrusions, but they have a high false alarm rate. There is generally a trade-off between detection rate and false alarm rate.
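A minimal sketch of this idea on a single hypothetical feature (connections per minute, with made-up numbers): the normal profile is a mean and standard deviation estimated from attack-free data, and an alert is raised when a new observation deviates beyond a threshold.

```python
# Toy threshold-based anomaly detection on one feature. The threshold
# controls the trade-off: lower -> higher detection rate but more false alarms.
import statistics

training = [10, 12, 11, 9, 13, 10, 11, 12]   # normal activity (attack-free)
mean = statistics.mean(training)
std = statistics.stdev(training)

def is_anomalous(value: float, threshold: float = 3.0) -> bool:
    """Alert when the value deviates from the profile by > threshold std-devs."""
    return abs(value - mean) / std > threshold

is_anomalous(11)   # within the normal profile
is_anomalous(60)   # far outside the profile -> alert
```

An unseen attack that perturbs this feature is detected without any signature for it, but an unusual-yet-legitimate burst of activity triggers a false alarm, which is exactly the trade-off noted above.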
Several IDSs have been developed in the public and private domains using a variety of techniques and with varying features. Commercial IDSs mostly use signature based detection techniques. The features offered by them include scalability, real-time detection and a user friendly interface. Open source IDSs employ either misuse detection or anomaly detection or both. They offer features like scalability and real-time detection. For example, Snort [4], an open source IDS, employs misuse detection and is capable of doing real-time detection. Public domain research IDSs generally employ novel detection techniques. For example, ADAM [6] uses data mining techniques and IDES [17] uses statistical techniques.
Looking at the intrusion detection field from a research perspective, the research in misuse detection is focused mainly on writing signatures which encompass all possible variations of an attack without matching normal activity and on developing efficient methods of pattern matching. In anomaly detection, the main focus is on finding methods for representing the normal profile, selection of features used for constructing the profile and determining threshold levels so that most intrusions are detected while false alarms are minimized. In an overall system perspective, the focus of current research is on developing hybrid systems, i.e. systems that are both network based and host based or that employ both anomaly detection and misuse detection.
1.1 Problem statement and Approach
In this thesis, we describe the design and implementation of a network based, real-time anomaly detection scheme for the Sachet IDS. Sachet is a network based, real-time, hybrid intrusion detection system developed at IIT Kanpur. Sachet employs both misuse detection and anomaly detection; hence it has the benefits of both techniques, i.e. the accuracy of misuse detection systems in detecting known attacks, and the ability of anomaly detection systems to detect unknown attacks. The Sachet IDS has an agent based architecture with a central server. The detection is carried out at each agent and the results are aggregated at the server. The architecture is explained in more detail in Chapter 3. In the remaining part of this section, we describe the main issues involved in the thesis, followed by our approach.
The main task in anomaly detection is to construct the normal profile of the system under observation. This profile should adapt to changes in the system over time. It should also be small enough that real-time detection is possible. The profile is generally constructed from a set of measures or features extracted from the data being analyzed. In this case, the features are extracted from the network packets sniffed at appropriate points in the network being monitored. One of the main issues here is feature extraction in real time.
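To make the feature extraction step concrete, here is a hypothetical sketch: packets are grouped by connection 4-tuple and simple per-connection counters are accumulated as packets arrive. The field names and the two features shown are illustrative; the actual feature set (including payload-derived features) is richer.

```python
# Illustrative per-connection feature extraction from packet headers.
# Each packet is a (src_ip, dst_ip, src_port, dst_port, length) tuple.
from collections import defaultdict

def extract_features(packets):
    """Accumulate packet and byte counts per connection 4-tuple."""
    conns = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, dst, sport, dport, length in packets:
        key = (src, dst, sport, dport)
        conns[key]["packets"] += 1
        conns[key]["bytes"] += length
    return dict(conns)

feats = extract_features([
    ("10.0.0.1", "10.0.0.2", 1025, 80, 100),
    ("10.0.0.1", "10.0.0.2", 1025, 80, 50),
])
```

Keeping the per-connection state incremental like this, rather than buffering whole connections, is what makes extraction feasible in real time.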
The construction of the profile from feature vectors follows the data stream model: we have a continuous stream of feature vectors, and the profile at any point should capture the information in the stream up to that point. If possible, the profile construction method should give more weight to newer data than to older data. Since the amount of network data is generally very large, any method used to construct the profile obviously cannot take as input the entire data seen in the stream so far. Hence, efficiently dealing with the data stream is also a major issue here. Older data in the stream has to be discarded periodically, but the information in the discarded data has to be retained to some extent. Stream handling techniques have to be employed for this purpose. Finally, the detection technique has to be implemented in Sachet so that it requires minimal human intervention.
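One simple way to see these requirements (bounded memory, newer data weighted more) is an exponentially decayed running summary of the stream. This is only an illustrative stand-in, not one of the stream handling techniques used in the thesis, and the decay factor alpha is a free parameter chosen here for illustration.

```python
# Exponentially decayed running mean over a stream of feature vectors:
# O(d) memory for d features, with each new vector weighted more heavily
# than older ones (older contributions decay geometrically).

def update_synopsis(synopsis, vector, alpha=0.1):
    """Fold one feature vector into the synopsis without storing the stream."""
    if synopsis is None:
        return list(vector)
    return [(1 - alpha) * s + alpha * x for s, x in zip(synopsis, vector)]

synopsis = None
for v in [[1.0, 2.0], [1.2, 2.1], [5.0, 9.0]]:
    synopsis = update_synopsis(synopsis, v)
```

The memory footprint is independent of the stream length, which is the property any usable synopsis must share.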
Our approach is as follows: the profile is learned from feature vectors using unsupervised learning (clustering) techniques. The features used for learning the profile are extracted for each connection in real-time, from the header and payload parts of network packets sniffed at various points in the network. Features corresponding to the payload part of the packet are extracted only for commonly used application layer protocols. These features are then aggregated at a single location, the Sachet learning agent, and the profile of the entire network is learned offline. Stream handling techniques are used to deal with the continuous stream of feature vectors. These techniques can be viewed as wrappers around the learning techniques. They construct a synopsis of the stream seen so far, with the possible option that newer data is given more weight in this synopsis. Learning is then applied on this synopsis and the resulting profile is distributed to the detection points where deviations are detected and alerts are raised.
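The learn-then-detect pipeline can be sketched as follows, using a plain k-means as a simplified stand-in for the clustering techniques actually considered (support vector clustering and a modified k-means): the learned profile is a set of cluster centroids, and a new feature vector is flagged as anomalous when it lies too far from every centroid.

```python
# Sketch of clustering-based profile learning and deviation detection.
# Naive k-means with deterministic initialization; assumes k >= 1 and
# len(points) >= k. The threshold plays the same role as in Section 1 above.
import math

def kmeans(points, k=2, iters=20):
    centroids = points[:k]                       # naive deterministic init
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[i].append(p)
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def is_anomalous(vector, centroids, threshold=2.0):
    """Deviation = distance to the nearest centroid of the normal profile."""
    return min(math.dist(vector, c) for c in centroids) > threshold
```

In the deployed scheme, `kmeans` would run offline at the learning agent over the stream synopsis, while `is_anomalous` is the cheap per-vector check distributed to the detection points.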
Two different unsupervised learning techniques, support vector clustering [7] and a modified k-means technique [14], were considered for learning the profile. To handle the feature vector stream, three different techniques were considered: the divide-and-conquer technique of clustering over data streams [15], reservoir sampling [25] and bootstrapping [16]. The five valid combinations (a clustering technique and a stream handling technique) resulting from the above were tested on a benchmark data set. The combination that gave the best results was implemented in the Sachet IDS. The implemented anomaly detection scheme was then tested on a benchmark data set of size 20GB, which contains over 50 attacks of various types.
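Of the three stream handling techniques, reservoir sampling is the simplest to state. The classic one-pass version (Vitter's Algorithm R) keeps a uniform random sample of fixed size k from a stream of unknown length:

```python
# Reservoir sampling (Algorithm R): after processing i+1 items, each item
# seen so far is in the reservoir with probability k / (i + 1).
import random

def reservoir_sample(stream, k, seed=None):
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)           # fill the reservoir first
        else:
            j = rng.randint(0, i)            # uniform index in [0, i]
            if j < k:
                reservoir[j] = item          # keep item with probability k/(i+1)
    return reservoir

sample = reservoir_sample(range(1_000_000), k=10)
```

Used as a wrapper around a clustering technique, the reservoir is the synopsis: memory stays fixed at k feature vectors no matter how long the stream runs, and clustering is applied to the reservoir instead of the full stream.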