Modular SRAM-based Binary Content-Addressable Memories
Ameer M.S. Abdelhadi and Guy G.F. Lemieux
Department of Electrical and Computer Engineering
University of British Columbia
Vancouver, Canada
Binary Content-Addressable Memory (BCAM)
Hardware-based single-cycle parallel search engines
• Write: stores new data at a specific address
• Match: searches all addresses for a given data (pattern)
Example: a BCAM holding B, C, D, A at addresses 0, 1, 2, 3
• Store ‘C’ in ‘1’ (write)
• Search for ‘D’: found in ‘2’ (match)
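The write and match operations can be modelled in a few lines of software. This is only a behavioural sketch: the class name `BCAM` and its methods are hypothetical, and the loop in `match` stands in for comparisons that the hardware performs on all addresses in a single cycle.

```python
class BCAM:
    """Behavioural model of a binary CAM (a software sketch, not hardware)."""

    def __init__(self, depth):
        # One storage cell per address; None marks an address never written.
        self.cells = [None] * depth

    def write(self, address, data):
        """Store new data at a specific address."""
        self.cells[address] = data

    def match(self, pattern):
        """Compare the pattern with every address; return the matching addresses."""
        return [addr for addr, data in enumerate(self.cells) if data == pattern]


cam = BCAM(depth=4)
for addr, data in enumerate("BCDA"):  # contents used in the slide example
    cam.write(addr, data)

print(cam.match("D"))  # [2]
cam.write(1, "C")      # the write shown on the slide
print(cam.match("C"))  # [1]
```

The later slides are all about obtaining this behaviour from FPGA block RAMs without giving up the single-cycle match.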
BCAM Applications
• Memory management: associative caches, translation lookaside buffers (TLBs)
• Databases: eliminates memory bottleneck
• Networking: IP lookup in routing/forwarding tables; intrusion detection (detect predefined suspicious packets); packet classification
• Pattern matching: e.g. DNA sequence lookup
• Data compression: find and shorten redundant patterns
• Data encryption: find and encrypt specific patterns
Motivation - FPGAs
• 1000’s of memory blocks
• 100,000’s of logic elements
• No dedicated BCAM resources in FPGAs
Objectives
BCAMs
• Massively parallel memory search
• Require high memory bandwidth
FPGAs
• Block RAMs are main storage
• Limited memory bandwidth
Use BRAMs to construct BCAMs
• Modular and flexible
• Storage efficient
• Single-cycle
• Performance oriented
Associative Arrays
Algorithmic Heuristics
• Hashes
• Search trees: tries, BSTs, …
• Multi-unpredictable-cycle
• Data dependent performance
• Variable search depth
• Misses due to conflicts
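Register-Based BCAM
• Concurrent register read and compare
• Single-cycle
• Limited resources
• Complex routing
• Fits small BCAMs
[Figure: register-based BCAM. An address decoder driven by WAddr (⌊log2 CD⌋ bits) enables one of the registers Reg0 … RegCD-1 to store WPatt (PW bits). Every register output is compared (=) with MPatt (PW bits) to form the match indicators (MIndc); a priority encoder produces MAddr (⌊log2 CD⌋ bits) and Match.]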
Brute-Force Transposed-Indicators-RAM (1)
A Traditional BRAM-based BCAM *
Key idea: Transposed RAM - data becomes addresses
• Write: to store ‘B’ at address ‘0’, write ‘0’ to location ‘B’
• Match: to search for ‘D’, read location ‘D’; the value read, ‘2’, is the match
[Figure: transposed RAM with locations A, B, C, D holding addresses 3, 0, 1, 2.]
* Xilinx App Notes
Brute-Force Transposed-Indicators-RAM (2)
Storing Data to Multiple Addresses
• How can we store data to multiple addresses?
• Specify addresses using one-hot coding
• Each bit indicates a match or “store at location”
PROBLEM: Depth of CAM is limited by data width of RAM
• e.g. to build a 1M deep CAM, we need a RAM that is 1M bits wide
• In FPGAs: 1000 BRAMs x 32 bits wide = 32K deep CAM
Pros: BRAM-based, single-cycle
Cons: depth of CAM is limited by RAM width
[Figure: one-hot match indicators over addresses 3, 2, 1, 0 for patterns D, C, B, A.]
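A minimal sketch of the transposed indicators RAM, assuming patterns are small integers that index the RAM rows and that each row is a bit vector with one indicator bit per CAM address. The class name `TransposedBCAM` is hypothetical, and the `stored` reference copy used to erase an overwritten pattern is an assumption of this sketch; the slides do not say how old indicators are cleared.

```python
class TransposedBCAM:
    """Brute-force transposed indicators RAM: the pattern is the RAM address."""

    def __init__(self, depth, pattern_width):
        self.depth = depth
        # One RAM row per possible pattern, so RAM depth is 2**pattern_width.
        # Each row is a depth-bit vector: bit a set means "pattern stored at address a".
        self.rows = [0] * (2 ** pattern_width)
        # Assumption: a reference copy tells us which indicator to clear on overwrite.
        self.stored = [None] * depth

    def write(self, address, pattern):
        old = self.stored[address]
        if old is not None:
            self.rows[old] &= ~(1 << address)  # clear the old pattern's indicator
        self.rows[pattern] |= 1 << address     # set the new pattern's indicator
        self.stored[address] = pattern

    def match(self, pattern):
        bits = self.rows[pattern]              # a single RAM read yields all indicators
        return [a for a in range(self.depth) if (bits >> a) & 1]


A, B, C, D = range(4)                          # encode the slide's symbols as integers
cam = TransposedBCAM(depth=4, pattern_width=2)
for address, pattern in enumerate([B, C, D, A]):
    cam.write(address, pattern)

print(cam.match(D))  # [2]
cam.write(2, A)      # overwrite address 2: D's indicator is cleared, A's is set
print(cam.match(A))  # [2, 3]
print(cam.match(D))  # []

# The depth limit quoted on the slide: 1000 BRAMs, each 32 bits wide.
max_depth = 1000 * 32
print(max_depth)     # 32000, i.e. roughly a 32K-deep CAM
```

Because each row needs one bit per CAM address, the CAM depth is bounded by the total RAM data width, which is the limitation the slide points out.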
BCAM Cascading
PROBLEM:
• Patterns are encoded as RAM addresses
• RAM depth is exponential to pattern width: RAM Depth = 2^(Pattern Width)
Solution: Cascading
1. Divide pattern into smaller slices
2. Search for each slice separately
3. If all slices are found: pattern match!
• RAM depth is linear to pattern width: RAM Depth = 2^(Slice Width) x (Pattern Width / Slice Width)
[Figure: the matched pattern is split into slices; each slice feeds its own CAM, and the slice matches are combined into the pattern match.]
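The cascading steps can be sketched as follows, assuming each slice has its own transposed RAM and that the per-slice indicator vectors are ANDed per address. The names `CascadedBCAM`, `ram_depth_flat` and `ram_depth_cascaded` are hypothetical, and the example widths (36-bit patterns, 9-bit slices) are illustrative only.

```python
def ram_depth_flat(pattern_width):
    """RAM depth when the whole pattern is used as the RAM address."""
    return 2 ** pattern_width


def ram_depth_cascaded(pattern_width, slice_width):
    """Total RAM depth when the pattern is split into slices."""
    num_slices = -(-pattern_width // slice_width)  # ceiling division
    return (2 ** slice_width) * num_slices


class CascadedBCAM:
    def __init__(self, depth, pattern_width, slice_width):
        self.depth = depth
        self.slice_width = slice_width
        self.num_slices = -(-pattern_width // slice_width)
        # One small transposed RAM per slice; each row is a depth-bit indicator vector.
        self.rams = [[0] * (2 ** slice_width) for _ in range(self.num_slices)]
        self.stored = [None] * depth  # assumption: reference copy used for overwrites

    def _slices(self, pattern):
        mask = (1 << self.slice_width) - 1
        return [(pattern >> (i * self.slice_width)) & mask
                for i in range(self.num_slices)]

    def write(self, address, pattern):
        bit = 1 << address
        old = self.stored[address]
        if old is not None:
            for ram, value in zip(self.rams, self._slices(old)):
                ram[value] &= ~bit
        for ram, value in zip(self.rams, self._slices(pattern)):
            ram[value] |= bit
        self.stored[address] = pattern

    def match(self, pattern):
        bits = (1 << self.depth) - 1
        for ram, value in zip(self.rams, self._slices(pattern)):
            bits &= ram[value]  # an address matches only if every slice matches there
        return [a for a in range(self.depth) if (bits >> a) & 1]


cam = CascadedBCAM(depth=4, pattern_width=8, slice_width=4)
cam.write(0, 0xAB)
cam.write(1, 0xAC)
cam.write(2, 0xCB)
print(cam.match(0xAB))  # [0]
print(cam.match(0xCC))  # [] (both slices occur, but never at the same address)

print(ram_depth_flat(36))         # 68719476736
print(ram_depth_cascaded(36, 9))  # 2048
```

The second lookup shows why the indicators are combined per address: both slices of 0xCC are present somewhere in the CAM, yet no single address holds the whole pattern.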
Hierarchical Search 2D BCAM (1)
Narrow and Deep BCAM
Key idea: Hierarchical search
• 1D BCAM: 4M deep (too deep)
• 2D BCAM: 2k x 2k
1. Find a set (row) with match using a 1D BCAM
2. Search this set (row) in parallel for a specific match
Hierarchical Search 2D BCAM (2)
Example
• Divide address space into sets
• RAM: each set in a line
• Transposed-RAM: indicates “pattern in set?”
• Hierarchical Search:
1. Find a set (row) with match using a 1D BCAM
2. Search this set (row) in parallel for a specific match
RAM (reference): addresses 0, 1, 2, 3 hold patterns 2, 3, 1, 1
Transposed-RAM by address (rows: patterns 0-3, columns: addresses 0-3):
  0: 0 0 0 0
  1: 0 0 1 1
  2: 1 0 0 0
  3: 0 1 0 0
With two sets (set 0 = addresses 0-1, set 1 = addresses 2-3):
RAM (one set per line):
  set 0: 2 3
  set 1: 1 1
Transposed-RAM (rows: patterns 0-3, columns: sets 0-1):
  0: 0 0
  1: 0 1
  2: 1 0
  3: 1 0
Match pattern ‘3’
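The two-step search can be sketched in software as below, using the example contents from the slide (patterns 2, 3, 1, 1 at addresses 0 to 3, in sets of two). The class name `HierarchicalBCAM`, the write procedure, and the choice of the lowest-numbered matching set are assumptions of this sketch; the slides only describe the search.

```python
class HierarchicalBCAM:
    def __init__(self, num_sets, set_size, pattern_width):
        self.set_size = set_size
        # RAM: one line per set, holding the patterns stored in that set.
        self.ram = [[None] * set_size for _ in range(num_sets)]
        # Transposed-RAM: one row per pattern, one "pattern in set?" bit per set.
        self.in_set = [0] * (2 ** pattern_width)

    def write(self, address, pattern):
        s, offset = divmod(address, self.set_size)
        old = self.ram[s][offset]
        self.ram[s][offset] = pattern
        # Clear the old pattern's bit only if no other copy remains in this set.
        if old is not None and old not in self.ram[s]:
            self.in_set[old] &= ~(1 << s)
        self.in_set[pattern] |= 1 << s

    def match(self, pattern):
        set_bits = self.in_set[pattern]
        if set_bits == 0:
            return None                                # no set holds the pattern
        # Step 1: pick a matching set (here the lowest-numbered one).
        s = (set_bits & -set_bits).bit_length() - 1
        # Step 2: search that set's line; hardware compares all entries in parallel.
        offset = self.ram[s].index(pattern)
        return s * self.set_size + offset


cam = HierarchicalBCAM(num_sets=2, set_size=2, pattern_width=2)
for address, pattern in enumerate([2, 3, 1, 1]):
    cam.write(address, pattern)

print(cam.match(3))  # 1
print(cam.match(1))  # 2 (address 3 also holds 1, but only one match is reported)
print(cam.match(0))  # None
```

The Transposed-RAM still has one row per possible pattern, so its depth remains exponential in the pattern width.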
Hierarchical Search 2D BCAM (3)
Pros and Cons
Pros:
• BRAM-based
• Single-cycle
• Efficient for deep CAMs
Cons:
• Single match only
• Cannot be cascaded
• RAM depth is exponential to pattern width
• Inefficient for wide patterns
Indirectly-Indexed 2D (II2D) BCAM (1)
Cascadable Wide and Deep BCAM
PROBLEM: is it possible to regenerate matches for all addresses?
Key observation: Transposed RAM is a sparse matrix
• n columns (a set of addresses) accommodate n matches (1’s) at most!
Key idea: use indirect indices to point to intra-set matches
• Cascadable
• Scalable (linear growth)
• Supports wider patterns
[Figure: the patterns x addresses matrix of match indicators is replaced by a patterns x sets matrix of indices that point to intra-set match indicators; labels: Indices, Indicators, BRAM, LUTRAM.]
Indirectly-Indexed 2D (II2D) BCAM (2)
Example
• Divide address space into sets
• Store sets with a match in Indicators-RAM
• Transposed-RAM stores indices to all matches in set
• Hierarchical Search:
  • Find indices of all matching sets in Transposed-RAM
  • Read Indicators-RAM using indices from Transposed-RAM
RAM (reference): addresses 0, 1, 2, 3 hold patterns 2, 3, 3, 1
[Figure: Transposed-RAM (rows: patterns 0-3, columns: sets 0-1) holding indices, with ‘-’ where a set has no match, next to the Indicators-RAMs.]
Match pattern ‘3’: found in ‘1’ and ‘2’ (match indicators 0 1 1 0 for addresses 0 1 2 3)
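A minimal sketch of the indirect indexing, using the slide's reference contents (patterns 2, 3, 3, 1 at addresses 0 to 3, in sets of two). The class name `II2DBCAM`, the order in which indices are assigned, and the rebuild-on-write update are assumptions of this sketch; the slides describe only the search path.

```python
class II2DBCAM:
    def __init__(self, num_sets, set_size, pattern_width):
        self.num_sets, self.set_size = num_sets, set_size
        # Reference RAM: one line per set.
        self.ram = [[None] * set_size for _ in range(num_sets)]
        # Transposed-RAM: for each (pattern, set), an index or None for "no match".
        self.index = [[None] * num_sets for _ in range(2 ** pattern_width)]
        # Indicators-RAMs: per set, up to set_size intra-set indicator vectors,
        # since a set of n addresses holds at most n distinct patterns.
        self.indicators = [[0] * set_size for _ in range(num_sets)]

    def _rebuild_set(self, s):
        # Simplified update (assumption): recompute one set from the reference RAM.
        for row in self.index:
            row[s] = None
        self.indicators[s] = [0] * self.set_size
        next_free = 0
        for offset, pattern in enumerate(self.ram[s]):
            if pattern is None:
                continue
            if self.index[pattern][s] is None:
                self.index[pattern][s] = next_free
                next_free += 1
            self.indicators[s][self.index[pattern][s]] |= 1 << offset

    def write(self, address, pattern):
        s, offset = divmod(address, self.set_size)
        self.ram[s][offset] = pattern
        self._rebuild_set(s)

    def match(self, pattern):
        hits = []
        for s in range(self.num_sets):       # hardware handles all sets concurrently
            idx = self.index[pattern][s]     # read the index from the Transposed-RAM
            if idx is None:
                continue
            bits = self.indicators[s][idx]   # read the intra-set match indicators
            hits += [s * self.set_size + o
                     for o in range(self.set_size) if (bits >> o) & 1]
        return hits


cam = II2DBCAM(num_sets=2, set_size=2, pattern_width=2)
for address, pattern in enumerate([2, 3, 3, 1]):
    cam.write(address, pattern)

print(cam.match(3))  # [1, 2]
print(cam.match(0))  # []
```

Unlike the hierarchical search, every matching address is recovered, so the resulting indicator vectors can be ANDed across slices for cascading.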
Indirectly-Indexed 2D (II2D) BCAM (3)
Area and Performance
• Except for a very narrow HS, II2D exhibits higher Fmax
• Register-based BCAM: register consumption
• II2D: linear ALM consumption; similar to other methods
• HS: exponential BRAM consumption
• II2D: linear BRAM consumption
• II2D supports wider patterns
[Figure: M20Ks (1000's), ALMs (1000's), and Fmax (MHz) versus pattern width (PW) for CAM depths (CD) of 16K, 32K, 64K, and 128K; series: Reg-based, BF-BCAM, HS-BCAM, II2D-BCAM, Device Limit.]
Open Source
http://ece.ubc.ca/~lemieux/downloads/
• Modular and parametric Verilog files
• Run-in-batch simulation and synthesis manager
Conclusions
Criteria: BRAM-based, single-cycle, deep, wide, cascadable, scalable
• Brute-Force Transposed-RAM: match indicators (patterns x addresses)
• Hierarchical Search 2D BCAM: set match indicators (patterns x sets)
• Indirectly-Indexed 2D (II2D) BCAM: indices (patterns x sets) pointing to intra-set match indicators
Thank You!
Backup Slides
Conclusions
[Figure: side-by-side structures of Brute-Force (patterns x addresses match indicators), Hierarchical (patterns x sets match indicators), and II2D (patterns x sets indices into match indicators); criteria: BRAM-based, single-cycle, deep, wide.]