Page 1: A Multi-Ported Memory Compiler Utilizing True Dual-port BRAMslemieux/publications/presentations/abdelhadi-fcc… · A Multi-Ported Memory Compiler Utilizing True Dual-port BRAMs Ameer

A Multi-Ported Memory Compiler Utilizing True Dual-port BRAMs

Ameer Abdelhadi and Guy Lemieux
Department of Electrical and Computer Engineering
University of British Columbia
Vancouver, Canada

May 3rd, 2016

Motivation (1): FPGAs as parallel accelerators

• Used as parallel accelerators
• Have dual-ported memories only

[Figure: FPGA resources: 1000’s of dual-ported Block RAMs, 1,000,000’s of logic elements, 1000’s of multipliers/DSPs]

Motivation (2): Mixed port requirements

[Figure: example datapath with blocks labeled /, √x, 1/x, f, g, f/g, >> and ALU, a 0/1 multiplexer, and a shared r/w bus, connected to registers R0,0, R1,0 and R2,0]

• Multi-porting approaches provide simple (fixed) ports only
• Waste of resources if these ports are not active simultaneously

Live-Value Table (LVT)

Multi-read (easy!): Replication
[Figure: write port W feeds 2-port RAM1 and 2-port RAM2, which are read by R0 and R1]

Multi-write (hard!): LVT
[Figure: LVT with 2 write and 1 read ports. W0 and W1 write to 2-port RAM1 and 2-port RAM2; in the LVT, one write port always writes 0’s and the other always writes 1’s, and the LVT output selects the RAM used by read port R]
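The multi-write idea on this slide can be sketched as a small behavioral model. This is only an illustration under simplifying assumptions: writes are applied one after another (no same-cycle conflicts are modeled), and the class and method names are hypothetical, not taken from the authors' code.

```python
class LVTMemory2W1R:
    """Behavioral model of a 2-write / 1-read memory: two banks plus an LVT."""

    def __init__(self, depth):
        # One data bank per write port; each bank only ever sees one writer.
        self.banks = [[0] * depth, [0] * depth]
        # Per address, the LVT remembers which write port wrote last (0 or 1).
        self.lvt = [0] * depth

    def write(self, port, addr, data):
        # Write port `port` updates its own bank and records its ID in the LVT.
        self.banks[port][addr] = data
        self.lvt[addr] = port

    def read(self, addr):
        # The LVT entry selects the bank that holds the live (most recent) value.
        return self.banks[self.lvt[addr]][addr]


mem = LVTMemory2W1R(depth=8)
mem.write(0, 3, 0xAA)   # W0 writes address 3
mem.write(1, 3, 0xBB)   # W1 overwrites address 3 later
mem.write(0, 5, 0xCC)   # W0 writes address 5
print(hex(mem.read(3)), hex(mem.read(5)))
```

Each write port stores a constant ID in the LVT (W0 stores 0, W1 stores 1), which is why the slide notes that a port always writes 0’s or always writes 1’s.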

Data Banks Optimization

• LVT-based multi-ported RAM is composed of:
  1) LVT - tracks changes
  2) Data banks - store data copies
• Our previous work (I-LVT / FPGA’14) optimizes LVT only
• This work
  • Optimizes the data banks (not the LVT!)
  • The first technique that requires a CAD tool

This work solves the final step and most important problem of Block RAM allocation

[Figure: data banks RAM 0, RAM 1, ..., RAM nW-1, each with 1 Write / nR Read ports, driven by WAddr and RAddr; the LVT provides BankSel]
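The data-bank organization in the figure (one bank per write port, each bank readable by every read port) can be modeled as follows. This is a sketch under stated assumptions: each 1 Write / nR Read bank is built by replicating a 1W/1R RAM once per read port, each memory fits in one RAM, and all names (`LVTDataBanks`, `ram_count`, `bank_sel`) are hypothetical.

```python
class LVTDataBanks:
    """n_write banks, each replicated n_read times (one 1W/1R RAM per replica)."""

    def __init__(self, n_write, n_read, depth):
        self.n_write, self.n_read = n_write, n_read
        # banks[w][r] is the copy of write port w's bank that read port r uses.
        self.banks = [[[0] * depth for _ in range(n_read)] for _ in range(n_write)]
        self.lvt = [0] * depth  # per address: which bank holds the live value

    def ram_count(self):
        # One 1W/1R RAM per (write port, read port) pair.
        return self.n_write * self.n_read

    def write(self, w, addr, data):
        for replica in self.banks[w]:   # every replica of bank w receives the write
            replica[addr] = data
        self.lvt[addr] = w

    def read(self, r, addr):
        bank_sel = self.lvt[addr]       # plays the role of BankSel in the figure
        return self.banks[bank_sel][r][addr]


m = LVTDataBanks(n_write=3, n_read=2, depth=16)
m.write(2, 7, 42)
m.write(0, 7, 99)
print(m.ram_count(), m.read(0, 7), m.read(1, 7))
```

The RAM count grows as the product of write and read ports, which is the cost the data-bank optimization targets.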

Mixed Port Requirements (1): Fixed ports

[Figure: example datapath (function units, ALU, shared r/w bus, registers R0,0, R1,0, R2,0)]

Fixed (simple) ports: The majority of multi-ported memories support fixed ports only

Mixed Port Requirements (2): True ports

[Figure: example datapath (function units, ALU, shared r/w bus, registers R0,0, R1,0, R2,0)]

True ports: Some techniques support the construction of multi-true-ports

BRAMs in FPGAs are true dual-ported

Mixed Port Requirements (3): Switched ports

[Figure: example datapath (function units, ALU, shared r/w bus, registers R0,0, R1,0, R2,0)]

Switched ports: A number of writes are switched with a number of reads

True ports are a special case of switched ports

Switched Ports (1): Example

[Figure: example datapath (function units, ALU, shared r/w bus, registers R0,0, R1,0, R2,0)]

Key Observation: BRAMs’ true ports can be utilized to optimize switched ports

Objectives: Optimize the construction of multi-switched ports

Switched Ports (2): Fixed ports abstraction

[Figure: the example datapath, with a block labeled "Fixed Ports"]

Switched Ports (3): Fixed data banks

[Figure: the example datapath, with blocks labeled "Fixed Ports" and "I-LVT"]

Switched Ports (4): DFG modeling

Complete Bigraph
• Vertex: Port
• Edge: 1W1R BRAM

[Figure: the example datapath, with a block labeled "Fixed Ports"]
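The complete-bigraph model can be sketched in a few lines: write ports and read ports are the two vertex sets, and every (write, read) pair is an edge that a 1W1R BRAM would implement. The port counts and the `W0`/`R0` naming below are illustrative assumptions, not the configuration shown on the slide.

```python
from itertools import product


def complete_bigraph(n_write, n_read):
    """Return (write vertices, read vertices, edges) of the complete bigraph."""
    writes = [f"W{i}" for i in range(n_write)]
    reads = [f"R{j}" for j in range(n_read)]
    # Every write port must be visible to every read port: one edge per pair.
    edges = list(product(writes, reads))
    return writes, reads, edges


writes, reads, edges = complete_bigraph(n_write=2, n_read=3)
for w, r in edges:
    print(f"{w} -> {r}")     # each edge corresponds to one 1W1R BRAM
print(len(edges), "edges")
```

With fixed ports only, the number of edges is the number of BRAMs in the data banks, so any saving has to come from covering several edges with a single BRAM.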

Switched Ports (5): Switched DFG

Complete Bigraph
• Vertex: Port
• Biclique pattern: BRAM

[Figure: example datapath (function units, ALU, shared r/w bus, registers R0,0, R1,0, R2,0)]

Switched Ports (6): DFG Covering

Complete Bigraph
• Vertex: Port
• Biclique pattern: BRAM

[Figure: the bigraph is covered step by step with biclique patterns: one "W1 R1 / W2 R2" pattern, one "W1 R1 / R2" pattern, and six "W R" patterns]

Switched Ports (7): Switched data banks

[Figure: data banks built from the covering patterns (one "W1 R1 / W2 R2", one "W1 R1 / R2", six "W R"), shown with I-LVT blocks]

Fixed Ports (Complete Bigraph): 12 BRAMs
Switched Ports (Optimized Bigraph): 8 BRAMs (33% reduction)
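The 33% figure follows directly from the two BRAM counts on this slide. A minimal check, with `reduction_percent` as a hypothetical helper name:

```python
def reduction_percent(baseline, optimized):
    """Relative saving of `optimized` over `baseline`, in percent."""
    return 100.0 * (baseline - optimized) / baseline


saving = reduction_percent(baseline=12, optimized=8)
print(f"{saving:.1f}% fewer BRAMs")
```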

Multi-switched-ports Compiler

• A RAM compiler optimizes data banks construction
• Generates DFG from port requirements
• Solves set-covering problem on all edges
  • Covers are predefined biclique patterns
  • Solved as BLP problem
• Generates Verilog modules based on optimal covering

Supports bypassing (RAW & RDW) and initialization

Available as open source contribution:
https://github.com/AmeerAbdelhadi
http://www.ece.ubc.ca/~lemieux/downloads/
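The set-covering step can be illustrated with a tiny exhaustive search. This is not the compiler's solver: the slides state the problem is solved as a BLP, whereas the sketch below simply tries all pattern subsets in order of size, which is only practical for toy inputs. The pattern list is also an assumption: every single edge is offered as a 1W1R BRAM, plus one hypothetical biclique pattern assumed to fit in a single true dual-port BRAM, and the objective is assumed to be the number of BRAMs.

```python
from itertools import combinations


def min_cover(edges, patterns):
    """Smallest list of pattern names whose edge sets cover every edge."""
    universe = set(edges)
    if not universe:
        return []
    names = list(patterns)
    for k in range(1, len(names) + 1):          # try smaller covers first
        for chosen in combinations(names, k):
            covered = set().union(*(patterns[name] for name in chosen))
            if universe <= covered:
                return list(chosen)
    return None  # the given patterns cannot cover all edges


# Complete bigraph for two writes and two reads.
edges = [(w, r) for w in ("W1", "W2") for r in ("R1", "R2")]

# Candidate covers: each edge on its own as a 1W1R BRAM ...
patterns = {f"1W1R[{w}->{r}]": {(w, r)} for (w, r) in edges}
# ... plus one hypothetical biclique pattern covering all four edges.
patterns["TDP[W1,R1|W2,R2]"] = set(edges)

print(min_cover(edges, patterns))
```

In the equivalent BLP, each pattern gets a 0/1 variable, each edge contributes a constraint that at least one selected pattern contains it, and the sum of the variables is minimized.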

Graphical User Interface (GUI)

[Figure: the compiler's GUI]

Source of inspiration: Multi-True-Ports by Choi et al. / UofT

• Provides true ports only (no simple/fixed ports)
• Is a special case of our generalized approach
• Doesn't need a CAD tool

[Figure: multi-true-port designs with a register-based LVT: a 3 Read / 3 Write example and the general n Read / n Write case, with RAMs selected by S0, S1, S2, S3, ..., Sn-1]

Experimental Results

• Run-in-batch flow manager
• Uses Altera’s Quartus II for synthesis on Stratix V
• Uses Altera’s ModelSim for verification with:
  • Random vectors
  • Over a million RAM access cycles
• Results on random test-cases
• Up to 8 switched ports
• Up to 4 writes and 4 reads per switched port
• Up to 28 writes/reads per test-case

Compared to         Average BRAM Reduction   Average ALMs Reduction   Average Fmax Increase
Best of Previous    18%                      -3%                      -1%
True Ports          42%                      53%                      15%

Conclusions

• A methodology to support switched write/read functionality
• True dual-ported BRAMs are utilized to optimize the RAM allocation
• A RAM compiler optimizes the problem
• An additional 18% average BRAM reduction compared to the best of other approaches
• Practical solution:
  • Initialization
  • Bypassing
  • Available as open source

Future Directions

• Applications
  • Parallel computation
  • HLS – storage binding
• Optimization of switched ports port assignment
  • Extraction of mutually-exclusive functions from HDL
• Statistical approach
  • Ports which are mutually-exclusive in most cases can use a switched port
  • Access conflicts will be rare

Thank You!

