A Parametrizable Processor - Computation Structures...

5/17/07 6.375: Bichler, Carli, Yamhure 1 A Parametrizable Processor Olivier Bichler Roberto Carli Alessandro Yamhure
Page 1: A Parametrizable Processor - Computation Structures Groupcsg.csail.mit.edu/.../group2_final_presentation.pdf · 5/17/07 6.375: Bichler, Carli, Yamhure 14 Results: IPS 0.00 10000000.00

5/17/07 6.375: Bichler, Carli, Yamhure 1

A Parametrizable Processor

Olivier Bichler

Roberto Carli

Alessandro Yamhure


Motivation

• Efficient system-on-a-chip solutions
• Wide spectrum of requirements
  – Performance
  – Area / Power
• Tuning without engineering


Overview

• One-Rule Synchronous Processor
• Combinational stages packaged as functions
• Parameter-controlled configuration
• Optimization (performance/area):
  – Compiler macros
  – Aggressive compiler
• Results / tradeoffs


One Rule to rule them all

• Explicit guards (WILL_FIRE)
  – FIFOF Methods
  – Data Memory
  – Writeback stage guard
• Customized SFIFO
  – First, find, find2, notFull, notEmpty, deq < enq < clear
  – Guards < Actions
  – Writeback < Execute
  – No Bypassing


Packaging for Abstraction

• Achieve highly modular parameterization
• Combinational workhorse unchanged
  – Can be packaged/abstracted into functions
  – ActionValue functions


3-stage packaged version


2-stage version (no pcQ)


2-stage version (no wbQ)


1-stage version


High-level schematics


2-stage Exploration

• Instruction memory
  – Never skipped
  – High latency
• Data memory
  – Only used on memory LD/ST instructions
  – Pipeline causes stalls
  – On branches/jumps no need for writeback
• However, specific SOC tasks can benefit from alternative solutions


Test Strategy

• Custom Makefile
  – Scans and changes parameters automatically
  – Synthesizes various configurations
  – Reports IPC, IPS, clock period, area
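The kind of sweep described above can be sketched in a few lines of Python. This is only an illustration, not the authors' Makefile: the function names are hypothetical, the stage count rule (one base stage plus one per enabled queue) is inferred from the slide titles "1-stage", "2-stage (no pcQ)", "2-stage (no wbQ)" and "3-stage", and the clock period is assumed to be in nanoseconds, which the slides do not state. The only relation used is that instructions per second equal IPC divided by the clock period.

```python
from itertools import product


def stage_count(pc_q: bool, wb_q: bool) -> int:
    """Assumed rule: each enabled queue adds one stage to the 1-stage base."""
    return 1 + int(pc_q) + int(wb_q)


def ips(ipc: float, clock_period_ns: float) -> float:
    """Instructions per second = IPC / clock period (period assumed in ns)."""
    return ipc / (clock_period_ns * 1e-9)


def sweep():
    """Enumerate the four pcQ/wbQ configurations with their stage counts."""
    configs = []
    for pc_q, wb_q in product([False, True], repeat=2):
        label = f"{'w' if pc_q else 'w/o'} pcQ, {'w' if wb_q else 'w/o'} wbQ"
        configs.append((label, stage_count(pc_q, wb_q)))
    return configs


if __name__ == "__main__":
    for label, stages in sweep():
        print(f"{label}: {stages} stage(s)")
    # Hypothetical inputs, not measurements from the slides.
    print(ips(ipc=0.5, clock_period_ns=5.0))
```

In a real flow, each configuration would be synthesized and the measured IPC and clock period fed into `ips`; here the inputs are placeholders.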


Results: Clock Period

[Chart: Post-place+route effective clock period (axis 0.00 to 8.00) for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ]

Expectation: Effective clock period is expected to decrease with the number of pipelined stages, as we make the critical path shorter


Results: IPS

[Charts: IPS for the benchmarks median, qsort, towers, vvadd and multiply, for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ. Two panels, labelled non-EHR RegFile and EHR RegFile]

Design exploration


Results: Area

[Charts: Post-synthesis total area and Post-place+route total area for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ. A further Post-place+route total area panel is labelled non-EHR RegFile / EHR RegFile]

Expectation: Area should increase with the number of pipelined stages


Performance/Area Tradeoffs

[Charts: IPS / Post-synthesis area, and FOM = SQRT(IPS) / Post-synthesis area, each for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ]

FOM ∝ IPS / (Area × Power)
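The two metrics named on this slide, IPS / Post-synthesis area and FOM = SQRT(IPS) / Post-synthesis area, can be computed as in the sketch below. The function names and the sample numbers are hypothetical placeholders, not results from the presentation; only the two formulas come from the slide.

```python
import math


def ips_per_area(ips: float, area: float) -> float:
    """First metric on the slide: IPS divided by post-synthesis area."""
    return ips / area


def fom(ips: float, area: float) -> float:
    """Figure of merit from the slide: SQRT(IPS) / post-synthesis area."""
    return math.sqrt(ips) / area


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT measurements from the slides.
    samples = {
        "config A": (4.0e7, 20000.0),
        "config B": (8.0e7, 23000.0),
    }
    for name, (ips_value, area) in samples.items():
        print(name, ips_per_area(ips_value, area), fom(ips_value, area))
```

Because the square root compresses differences in IPS, the FOM weighs area more heavily than the plain IPS / area ratio does, so the two metrics can rank configurations differently.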


Conclusions

• Task-specific, customizable and flexible processor design
• User-defined parameter controls degree of parallelism / number of stages
• Each configuration optimized for area and performance
• Balanced tradeoff between configurations

