ASK HERE

Wifi · 28-10-2010, 09:48 PM

Packet classification algorithms

[attachment=6959]
Introduction
The process of categorizing packets into flows in an internet router is called packet classification.Packet classification is needed for non best-effort services, such as firewalls and quality of service, service s that require the capability to distinguish and isolate traffic in different flows for suitable processing. Packet classification can be done on a single field or multiple fields. There are a variety of algorithms proposed for packet classification. They can be broadly categorized as basic data structures / search algorithms, geometric algorithms, heuristic algorithms and hardware specific algorithms. Flows are specified by rules applied to incoming packets. We call a collection of rules a classifier. Each rule specifies a flow that a packet may belong to based on some criteria.
Algorithm
Algorithms and techniques that a flow-aware router uses to classify packets into flows (packet classification. The Flow-aware router keeps track of flows and perform similar processing on packets in a flow . While flow-unaware router (packet-by-packet router) treats each incoming packet individually.
Packet classification can be done on a single field or on multiple fields . Fields used for classification can be Data Link Layer (Layer 2) Source, Destination address, Network Layer (Layer 3) Source, Destination Address, Transport Layer (layer 4) Source, Destination Port number and Layer Protocol.

REQUIREMENTS OF PACKET CLASSIFICATION ALGORITHM
The requirements of packet classification can vary depending up on the application and the location where the classification is performed. The classification should be done at the line speeds with minimum time required for classification and minimum memory .

The requirements can be listed as under:
• Number of rules to be supported: Number of rules for PC can vary depending on applications such as firewalls, backbone router etc.
• Number of fields used: PC applications differ on basis of the fields in IP header used for classification.
• Nature of rules: The rules are about the prefix mask or general masks used on destination IP address.
• Updating the set of rules: The PC should be adaptable changes in the rules due to route or policy changes without sacrificing access performance
• Worst case Vs average case : For performance of PC, focusing should be done on worst case than average case.
All these algorithms can be categorized in four classes as under:
A. Basic Data Structures: The algorithms based on data structures are Linear Search, Caching, Hierarchical Tries, Set Pruning Tries, etc.
B. Geometry Based: The algorithms based on geometry are Grid of Tries, Area Based Quad Tree , Fat Inverted Segment Tree, etc.
C. Heuristic: The algorithms based on heuristic category are Recursive Flow Classification, Hierarchical Intelligent Cutting , Tuple Space Search, etc.
D. Hardware: The algorithms based on hardware category are Ternary CAM, Bitmap-Intersection, etc.
Basic Data Structure/ Search Algorithms:
A.Linear Search
Prefixes are kept in a linked list.Every arriving packet is compared with each rule sequentially until a rule is found that matches all required fields i.e. each rule is evaluated sequentially until a rule is found that matches all the relevant fields in the packet header.O(N) storage, O(N) lookup time, O(1) update complexity Improve average time by keeping linked list sorted in order of prefix lengths
Hierarchical Tries:
A radix trie is a binary tree with labeled branches and traversed during search operations using individual bits of the search key. The left branch is labeled zero and right branch is labeled one. A node represents a bit string formed by concatenating labels of all branches in the path from the root node. A radix trie is a binary tree with labeled branches and traversed during search operations using individual bits of the search key. The left branch is labeled zero and right branch is labeled one. A node represents a bit string formed by concatenating labels of all branches in the path fromthe root node

Set-pruning Tries
The data structure query time is reduced due to elimination of need to do recursive traversals. This is achieved by replicating rules at several nodes in the data structures. A set pruning trie has rules in (d-l) dimensional trie linked to a prefix p in FI trie are pushed down. This pushing down of prefixes is carried out recursively on the remaining (d-l ) dimensions in the set pruning trie data structure. The query algorithm for an incoming packet (v.. V2, ... ,Vd) only traverse the Fl-trie to find the longest matching prefix of VI, follow its next-trie pointer (if non null), traverse the F2-trie to find the longest matching prefix of v1 and so on for all dimensions . B. Geometry Based Algorithms:
Grid-of- Tries:
It is an optimization of hierarchical trie data strucure for 2 dimensions. Itavoids memory blowup of set pruning tries by allocating a rule to only one trie node as in hierarchical tries but still achieves O(W) query time by using precomputation and storing a switch pointer in some trie nodes.

Recursive Flow Classification
Classifying a packet can be viewed as mapping S bits in the packet header to T bits of class lD (an identifier denoting the rule or action) where, T =log N, T« S in a manner dictated by the classifier rules. A simple and fast, but unrealistic way of doing this mapping is to precompute the value of class lD for each of the 2s different packet header values. It is difficult to achieve both high classification rate and reasonable storage in the worst case. Real classifiers exhibit structure and redundancy. A practical scheme could exploit this structure and redundancy.
The algorithm operates as follows:
• In the first phase (phase 0), d fields of the packet header arc split up into multiple chunks that are used to index into multiple memories in parallel.
• In subsequent phases, the index into each memory is formed by combining the results of the lookups fromearlier phases.
• After successive combination and reduction, one result is received from the memory lookup, in the final phase.Because of the way the memory contents have been precomputed, this value corresponds to the class ID of the packet .

Hierarchical Intelligent Cuttings
HiCuts build a decision tree data structure such that the internal nodes contain suitable information to guide the classification algorithm, and the external nodes (i.e., the leaves of the tree) store a small number of rules. On receiving an incoming packet, the classification algorithm traverses the decision tree to arrive at a leaf node.The algorithm then searches the rules stored at this leaf node sequentially to determine the best matching rule. The shape characteristics of the decision tree such as its depth, the degree of each node and the local search decision to be made by the query algorithm at each node, are chosen while preprocessing the classifier and depend on the characteristics of the classifier.

HiCuts algorithm in d dimensions can be formulated as follows:
Each internal node, v, of the tree represents a portion of the geometric search space and the root node represents the Perhaps an algorithm can exploit this structure A heuristic hybrid scheme.
HiCuts: Pros and Cons
Advantages
 Exploits structure of real-life classifiers
 Adapts data structure
 Suitable for multiple fields
 Supports incremental updates
Disadvantages
 Depends on structure of classifiers
 Large pre-processing time
 Large worst-case storage requirements
TCAM
Hardware implementation comes from the need for higher packet processing capacity that is typically not obtainable by software implementations. TCAMs can be used for multi-dimensional classification with each row of the TCAM memory array to be wider than 32 bits, the required width depends on the number of fields used for classification , and usually varies between 128 and 256 bits depending on the application.

Advantages
 Suitable for multiple fields
 Fast: 16-20 ns (50-66 Mpps)
 Simple to understand
Disadvantages
 Inflexible: range-to-prefix blowup
 Density: largest available in 2000 is 32K x 128 (but can be cascaded)
 Management software, and on-chip logic: non-trivial complexity
 Power: 5-8 W
 Incremental updates: slow
 DRAM-based CAMs: higher density but soft-error is a problem

CONCLUSION
Linear search is a simple algorithm but poor at scaling. Hierarchical trie provides a good worst case query time at the expense of storage. Set pruning tries has reduced query time and has fast retrieval at the cost of storage. Grid of tries rebuilds each update and can be used for last 2 dimensions of a multidimensional hierarchical trie. RFC is suitable for small set of rules, uses preprocessing and large storage space. In HiCuts parameters can be tuned to trade-off query time against storage requirements. TCAM is simple but very costly and is good for small classifiers .

[attachment=6959]

Possibly Related Threads...
Thread		Author	Replies	Views	Last Post
	General Packet Radio Service (Download Full Seminar Report)	Computer Science Clay	10	15,857	22-03-2014, 12:46 PM Last Post: MichaelPn
	High Speed OFDM Packet Access (HSOPA)	computer science crazy	2	10,502	08-12-2012, 02:44 PM Last Post: seminar details
	High-Speed Downlink Packet Access (HSDPA)	shibin.sree	1	9,217	08-12-2012, 02:44 PM Last Post: seminar details
	High Speed Packet Access	seminar surveyer	1	9,172	08-12-2012, 02:44 PM Last Post: seminar details
	AI-based Classification and Retrieval of Reusable Software Components	computer girl	0	1,090	11-06-2012, 12:07 PM Last Post: computer girl
	Text Classification from Labeled and Unlabeled Documents using EM	computer girl	0	834	09-06-2012, 11:28 AM Last Post: computer girl
	Controlling IP Spoofing Through Inter-Domain Packet Filters	seminar surveyer	1	2,562	29-02-2012, 12:51 PM Last Post: seminar paper
	Resilient Packet Ring Technology	computer science crazy	1	2,010	20-02-2012, 10:43 AM Last Post: seminar paper
	libpcap [Packet Sniffing for Security ]	seminar class	1	1,620	10-02-2012, 09:50 AM Last Post: seminar addict
	PACKET SNIFFING WITH ANTI STUFF	seminar projects crazy	1	8,730	10-02-2012, 09:50 AM Last Post: seminar addict

Important Note..!

ASK HERE