HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification
Author: Wenjun Li, Xianfeng LiPublisher: 2013 IEEE 21st
Annual SymposiumPresenter: Ching Hsuan ShihDate: 2013/12/11
Outline
I. IntroductionII.BackgroundIII.Related WorkIV.HybridCutsV.Experimental Results
I. IntroductionTwo major approaches to packet classification:• Architectural: TCAM• Advantage: process packets at line speed• Disadvantage: expensive, high power consumption, rule
duplication• Algorithmic: • Decision-tree: Hicuts, HyperCuts, and EffiCuts• Decomposition: BV
I. Introduction (Cont.)Proposed scheme:• HybridCuts: a combination of decomposition and
decision-tree techniques•A rule set decomposition algorithm based on the observation that most rules have at least one small field.•A novel one-dimensional cutting algorithm called FiCuts.•A two-stage cutting framework which combines one-dimensional cutting and multi-dimensional cutting techniques.
II. BackgroundA. The Packet Classification Problem• To find a matching rule from a packet classifier for a
packet.• A classifier is a set of rules, with each rule R
consisting of a tuple of F field values, and an action to be taken in case of a match.
II. Background (Cont.)B. Complexity in Theory• From a geometric point of view, for N non-overlapping
hyper-rectangles in F-dimensional space:• The best bounds for locating a point are either Θ(log N) time
with Θ(NF) space, or Θ(logF-1 N) time with Θ(N) space.• Impractical: with just 1000 rules and 4 fields, a solution is
either impracticably (NF is about 1000G) or too slow (logF-1 N is about 1000 memory accesses).
II. Background (Cont.)C. Complexity in Practice• It is infeasible to design a single algorithm that can perform
well in all cases.• Packet classification rules in real-life have some inherent
characteristics that can be exploited to reduce the complexity.• The protocol field is restricted to a small set of values.• Rules specify a limited number of distinct transport port ranges.• The number of address prefixes matching a given address is typically
five or less.• The number of rules matching a given packet is typically five or less.• Many different rules share the same field values.
III. Related WorkA. Parallel Bit-Vector (BV)• One of most representative decomposition based
solutions.• It works on the individual fields of rules independently for
partially matching results.• Then a bit-wise AND operation on all bit vectors is
performed to get the final result.
III. Related Work (Cont.)
III. Related Work (Cont.)
B. HiCuts and HyperCuts• Take a geometric view of the packet classification problem.• HiCuts chooses one dimension to cut at one time.• HyperCuts chooses multiple dimensions to cut at one time.
III. Related Work (Cont.)C. EffiCuts• Separate rule set into several subsets, depending on
whether the value of each dimension is wildcard (or almost wildcard).• Each subset creates its own decision-tree independently
using HyperCuts.
III. Related Work (Cont.)
D. ParaSplit• Employs a complex heuristic for rule set partitioning.• It is different from EffiCuts in that its objective is for
efficient hardware implementation using FPGAs.
IV. HybridCutsA. Decomposition-based FrameworkDefinitions:• Given an N-dimensional rule R = (F1, F2, F3, …, FN), Leni represents the
length of field Fi , and a threshold value vector T = (T1, T2, T3, …, TN). • We call Fi is a small field if Leni≦Ti. Then we define the following
concepts for R:• Big rule: ∀i ∈{1, 2, 3 ... N}, Fi in R is a big field• Small rule: ∃i ∈{1, 2, 3 ... N}, Fi in R is a small field
IV. HybridCuts (Cont.)• We decompose a 5-tuple rule set into the following five subsets
without duplicates among each other:• 1. Big-subset: SA, DA, SP and DP are all big field• 2. SA-subset: SA is a small field for each rule• 3. DA-subset: DA is a small field for each rule• 4. SP-subset: SP is a small field for each rule• 2. DP-subset: DP is a small field for each rule
• When processing a rule R with two small fields: SA and DA, suppose the sizes of the SA-subset and the DA-subset up to now are N1 and N2 respectively, then rule R will go to SA-subset if N1≦N2, or to DA-subset vice versa.
IV. HybridCuts (Cont.)
IV. HybridCuts (Cont.)
B. FiCuts: A One-Dimensional Cutting Technique• Simplicity: FiCuts conducts cuttings on the subset along a fixed dimension• Adaptivity: FiCuts can decide when to stop FiCuts and resort to other
more effective cutting methods
How to decide np:spfac * number of rules at node r ≥ ∑ number of rules at each child of node r + np
IV. HybridCuts (Cont.)binth = 4 spfac = 2
IV. HybridCuts (Cont.)C. Multi-dimensional Cutting• With the shrinking of the search space, rule replications begin to
rise with fine cuts.• FiCuts makes the decision. If the number of cuts np at a node is
less than a predefined MAXCUTS, FiCuts resorts to HyperCuts
IV. HybridCuts (Cont.)binth = 2spfac = 2
IV. HybridCuts (Cont.)D. Optimization• Most rules have at least one small IP address field (SA or DA).• Based on this observation, we reduce the number of decision trees.
For a 5-tuple rule set, we decompose it into three subsets:• 1. Big-subset: Both SA and DA are big field• 2. SA-subset: SA is a small field for each rule• 3. DA-subset: DA is a small field for each rule
IV. HybridCuts (Cont.)
Depends on SA, DA, SP, and DP
Depends on SA, and DA
V. Experimental ResultsA. Memory Consumption
V. Experimental Results (Cont.)
A. Memory Accesses
V. Experimental Results (Cont.)C. Potential of Parallelization• It can be seen that in most cases, the worst-case height of a single
tree is half of or less than the overall number of memory accesses on all the trees. This means that the trees constructed are balanced in height among each other, amenable to a parallel implementation with a potential 2x speedup.