Optimizing the Architecture of SFQ-RDP (Single Flux Quantum- Reconfigurable Datapath)


F. Mehdipour*, Hiroaki Honda**, H. Kataoka*, K. Inoue* and K. Murakami*

*Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan

**Institute of Systems, Information Technologies and Nanotechnologies (ISIT), Fukuoka, Japan

E-mail: [email protected]

SSV 2009, Kyushu University

CREST-JST (2006~): Low-power, high-performance, reconfigurable processor using single-flux quantum circuits

SFQ-LSRDP project members:
• Kyushu Univ. (Prof. K. Murakami, Dr. K. Inoue, Dr. H. Honda, Dr. F. Mehdipour, H. Kataoka): architecture, compiler and applications
• Superconducting Research Lab. (SRL) (Dr. S. Nagasawa et al.): SFQ process
• Yokohama National Univ. (Prof. N. Yoshikawa et al.): SFQ-FPU chip, cell library
• Nagoya Univ. (Prof. A. Fujimaki et al.): SFQ-RDP chip, cell library, and wiring
• Nagoya Univ. (Prof. N. Takagi (Leader) et al.): CAD for logic design and arithmetic circuits

Agenda

• Introduction
• Large-Scale Reconfigurable Data-Path (LSRDP): General Architecture and Specifications
• Design Procedure and Tool Chain
• Preliminary Results
• Conclusions and Future Work

Introduction

• To improve performance, various accelerators are used together with general-purpose processors (GPPs): PowerXCell, GPU, GRAPE-DR, ClearSpeed, etc.
• They offer small size and low power consumption compared with processors of similar performance.

[Figure: NVIDIA Tesla S1070, http://www.nvidia.com]

Acceleration Through a Data-Path Processor

Mechanism:
• Acceleration by using a data-path accelerator
• The accelerator augments the base processor
• Hot portions of applications are executed on the accelerator

How a Reconfigurable Processor Works

[Figure: application code in main memory, divided into critical and non-critical code, together with a GPP and an LSRDP.]

Coupling an Accelerator to a Processor

[Figure: tight and loose coupling of an accelerator to a processor. Labels: Processor, RFU, Coprocessor, Attached Processor, Bridge, Memory.]

Motivation

Conventional accelerators:
• A large memory bandwidth is demanded for high-performance computation
• On-chip memories are often used to hide memory access latency

Large-Scale Reconfigurable Data-Path (LSRDP):
• is introduced as an alternative accelerator
• reduces the number of memory accesses by utilizing a data-path

Outline of Large-Scale Reconfigurable Data-Path (LSRDP) processor

Features:
• Data Flow Graphs (DFGs) extracted from critical calculation parts are directly mapped
• Pipeline execution
• Burst transfer is used to input/output rearranged data from/to memory

The reconfigurable data-path includes:
• A large number of floating-point Functional Units (FUs) arranged as arrays
• Reconfigurable Operand Routing Network (ORN)
• Dynamic reconfiguration facilities
• Streaming Buffer (SB) for I/O ports

[Figure: GPP and main memory connected to the LSRDP, which consists of rows of FUs separated by ORNs, with SB, SMAC and scratchpad memory. ORN: Operand Routing Network.]

Single-Flux Quantum (SFQ) against CMOS

CMOS issues (if LSRDP has 32x32 FUs):
• High electric power consumption
• High heat radiation and difficulties in high-density packing

SFQ features:
• High-speed switching and signal transmission
• Low power consumption
• Compact implementation of a system (small area)
• No cost for latch
• Suitable for pipeline processing of data stream
• Serial bit-level processing

[Figure: a single flux quantum in a superconducting loop with Josephson junctions.]

Goals of the Project

• Discovering appropriate scientific applications
• Developing compiler tools
• Developing performance analyzing tools
• Designing and implementing the SFQ-LSRDP architecture, considering the features and limitations of SFQ circuits


LSRDP General Architecture and Specifications

Parameters to Be Decided Within the LSRDP Design Procedure

• Core structure: a rectangular matrix of PEs. Width and height?
• PE: combination of a Functional Unit (FU) and a data Transfer Unit (TU)
• Layout: FU types (ADD/SUB and MUL)?
• Maximum Connection Length (MCL) between consecutive rows? (It is impossible to implement a full cross bar.)
• Reconfiguration mechanism? (PE, ORN, immediate data)
• On-chip memory configuration?

[Figure: LSRDP core as a Width x Height matrix of PEs (PE1 ... PEm), with an Operand Routing Network (ORN) between rows and Streaming Buffers (SB).]

LSRDP Architecture

Processing Elements (PEs):
• FU: implements basic 64-bit double-precision floating-point operations including ADD, SUB and MUL
• TU (transfer unit): a routing resource for transferring data from a row to a non-consecutive row
• A PE includes two components and offers four functionalities

[Figure: the PE configurations built from FU and TU components.]

Layout Types - Type I

• Each PE implements ADD/SUB and MUL
• Flexible, but consumes a lot of resources

[Figure: W x H array of PEs with ORNs between rows; each PE contains A (ADD/SUB), M (MUL) and T (Transfer Unit).]

Layout Types - Type II (Checkered)

• Each PE implements ADD/SUB or MUL

[Figure: W x H array of PEs with ORNs between rows; PEs of type ADD/SUB + TU and MUL + TU are arranged in a checkered pattern.]

Layout Types - Type III (Striped)

• Each PE implements ADD/SUB or MUL

[Figure: W x H array of PEs with ORNs between rows; rows of MUL + TU PEs alternate with rows of ADD/SUB + TU PEs.]

Type II or III: which one is more efficient?
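
The two single-operation layouts can be sketched in a few lines. This is only an illustration, not the authors' tool: the function names, and the choice of which FU type occupies position (0, 0), are assumptions.

```python
def build_layout(kind, width, height):
    """Return a height x width grid of FU types: 'A' (ADD/SUB) or 'M' (MUL).

    kind is 'checkered' (Type II) or 'striped' (Type III). Which type sits
    at position (0, 0) is an assumption made for this sketch.
    """
    if kind not in ("checkered", "striped"):
        raise ValueError("kind must be 'checkered' or 'striped'")
    grid = []
    for row in range(height):
        if kind == "checkered":
            # Type II: the FU type alternates along both rows and columns.
            cells = ["M" if (row + col) % 2 == 0 else "A" for col in range(width)]
        else:
            # Type III: every PE in a row has the same FU type.
            cells = ["M" if row % 2 == 0 else "A"] * width
        grid.append(cells)
    return grid


def count_units(grid):
    """Count how many ADD/SUB and MUL units a layout offers."""
    flat = [cell for row in grid for cell in row]
    return {"A": flat.count("A"), "M": flat.count("M")}


for kind in ("checkered", "striped"):
    print(kind)
    for row in build_layout(kind, width=4, height=4):
        print(" ".join(row))
```

Both layouts offer the same number of units of each type for even dimensions; they differ only in where a DFG node of a given type can be placed, which is what the mapping results later compare.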

Maximum Connection Length (MCL)

MCL: maximum horizontal distance between two PEs located in two subsequent rows

[Figure: PEs (i,j), (i,j+1), (i,j+2) in row i connected through the ORN to PEs in row i+1, with examples of connection length 0, connection length 2, and the longest connection length L, reaching (i+1,j+L).]
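
As a minimal sketch of this definition, the MCL of a placed DFG can be computed from node positions. The data layout (a dictionary of `(row, column)` positions and a list of producer/consumer edges) and the function name are assumptions for illustration.

```python
def max_connection_length(placement, edges):
    """Return the largest horizontal distance over all connections.

    placement maps a node name to its (row, column) position.
    edges is a list of (producer, consumer) pairs; each consumer is
    expected to sit in the row directly below its producer.
    """
    longest = 0
    for producer, consumer in edges:
        src_row, src_col = placement[producer]
        dst_row, dst_col = placement[consumer]
        if dst_row != src_row + 1:
            raise ValueError(f"{producer} -> {consumer} does not link consecutive rows")
        # Connection length is the horizontal (column) distance only.
        longest = max(longest, abs(dst_col - src_col))
    return longest


placement = {"a": (0, 0), "b": (0, 2), "c": (1, 2)}
edges = [("a", "c"), ("b", "c")]  # lengths 2 and 0
print(max_connection_length(placement, edges))  # prints 2
```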

An ORN Structure

The ORN consists of 2-bit shift registers and 1-by-2 and 2-by-2 cross-bar switches.

[Figure: two rows of FPUs and transfer units (T) connected through an ORN built from cross-bar switches (CB, ½CB) and T2 elements. Legend: FPU, 2-bit shift register, ORN.]

Source: A. Fujimaki, et al., "Demonstration of an SFQ-Based Accelerator Prototype for a High-Performance Computer," ASC08, 2008.

Dynamic Reconfiguration Mechanism

[Figure: state diagram with the states Initial State, idle, wait, Reconfiguration (ORN, Immediate, PE) and Execution, and the transitions Starting of Reconfiguration, End of Reconfiguration, Starting of Execution and End of Execution.]

Dynamic Reconfiguration Architecture

[Figure: a PE with FU (A op B), Transfer Unit, 64-bit Immediate Register and MUXes selecting Input-A, Input-B and Input-C from the ORN; the configuration register holds log(2 x (2MCL+1)) x 3 [bit].]

Three bit-stream lines for dynamic reconfiguration of:
• Immediate registers (64-bit) in each PE
• Selector bits for muxes selecting the input data of FUs
• Cross-bar switches in ORNs
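
The configuration register size log(2 x (2MCL+1)) x 3 can be evaluated for a given MCL as follows. This is a sketch under two assumptions not stated on the slide: the logarithm is base 2, and the result is rounded up to whole bits per multiplexer. The function names are illustrative.

```python
def orn_input_count(mcl):
    """Number of selectable sources per FU input: 2 x (2 x MCL + 1)."""
    return 2 * (2 * mcl + 1)


def selector_bits(mcl, fu_inputs=3):
    """Selector bits per PE for the three input muxes (A, B and C).

    Assumes log base 2, rounded up to a whole number of bits per mux.
    """
    choices = orn_input_count(mcl)
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floats.
    bits_per_mux = (choices - 1).bit_length()
    return bits_per_mux * fu_inputs


for mcl in (2, 4, 9):
    print(mcl, orn_input_count(mcl), selector_bits(mcl))
```

For MCL = 2 this gives 10 selectable sources and, under the rounding assumption, 4 bits per mux, i.e. 12 selector bits per PE.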


Design Procedure and Tool Chain

Compiler and Design Flow

[Figure: design flow. Application code goes through hardware-software partitioning (manual or automatic) into critical parts (h/w part) and non-critical parts (s/w part). The s/w part is modified and compiled into binary code for the GPP. DFGs are generated from the critical parts and mapped onto the LSRDP architecture (placement, routing, port positioning), followed by bit-stream generation, which produces configurations for the LSRDP. In the design phase, the DFG mapping results are analyzed.]

• DFGs are manually generated from critical parts of applications
• DFG mapping results are used for:
  • Analyzing LSRDP architecture statistics
  • Generating LSRDP configuration bit-streams

Benchmark Applications for Design Procedures

• Finite difference method calculation of 2nd-order partial differential equations:
  • 1-dim. heat equation (Heat)
  • 1-dim. vibration equation (Vibration)
  • 2-dim. Poisson equation (Poisson)
• Quantum chemistry application: recursive parts of Electron Repulsion Integral calculation (ERI-Rec)

Only ADD/SUB and MUL operations are used in the critical calculations of all the above applications.

DFG Extraction - Heat Equation

1-dim. heat equation for T(x,t):

$$\frac{\partial T(x,t)}{\partial t} = A \, \frac{\partial^2 T(x,t)}{\partial x^2} \qquad (A \text{ is const.})$$

Calculation by Finite Difference Method (FDM):

$$T(x_i, t_{j+1}) = D \cdot T(x_i, t_j) + B \cdot \{ T(x_{i+1}, t_j) + T(x_{i-1}, t_j) \}$$

[Figure: basic DFG computing T(i,j+1) from T(i-1,j), T(i,j) and T(i+1,j) with two additions and two multiplications using the constants D and B.]

• The basic DFG corresponds to the minimum FDM calculation
• The basic DFG can be extended in the horizontal and vertical directions to make a larger DFG
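
The basic DFG can be written out as one FDM time step. The sketch below assumes the standard explicit scheme, in which D = 1 - 2r and B = r with r = A·Δt/Δx²; the slides do not define D and B, so these values, the fixed boundary handling and the function name are assumptions.

```python
def fdm_step(temps, r):
    """Advance the interior points of a 1-D temperature profile by one step.

    r stands for A * dt / dx**2 (assumed, standard explicit scheme).
    The two boundary values are kept unchanged.
    """
    d = 1.0 - 2.0 * r  # coefficient D applied to T(x_i, t_j)
    b = r              # coefficient B applied to the sum of the neighbours
    new = list(temps)
    for i in range(1, len(temps) - 1):
        # Basic DFG: one ADD, two MULs, one ADD.
        neighbours = temps[i - 1] + temps[i + 1]
        new[i] = d * temps[i] + b * neighbours
    return new


profile = [0.0, 4.0, 0.0]
print(fdm_step(profile, r=0.25))  # prints [0.0, 2.0, 0.0]
```

Each interior point corresponds to one instance of the basic DFG; placing several instances side by side (more points) or stacking them (more time steps) gives the horizontal and vertical extensions mentioned above.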

Example of extracted DFGs - Heat

A huge sample DFG (Heat):
• Inputs: 32
• Outputs: 16
• Operations: 721
• Immediates: 364

DFG Distribution for each application

[Figure: scatter plot of the 24 DFGs; x-axis: # of Inputs (0 to 70), y-axis: # of FUs (0 to 1000). Series: Poisson (3), Vibration (7), Heat (6), ERI-Rec (8 DFGs).]

DFGs have different qualities in terms of the # of FUs and the # of Inputs and Outputs.

DFG Classification

Class    # of FUs   # of Inputs   # of Outputs   # of DFGs
RDP-S    128        19            12             Heat (3), Poi (1), Vib (2), Eri (4)
RDP-M    512        19            12             Heat (1), Poi (1), Vib (1), Eri (4)
RDP-L    1024       38            24             Heat (2), Poi (1), Vib (2), Eri (5)
RDP-XL   > 1024     64            52             Heat (1), Poi (1), Vib (2), Eri (5)

Due to the broad range of DFG sizes, DFGs are classified as S, M, L and XL with respect to their size and the number of input/output nodes => LSRDP designing processes for S, M, L, XL, respectively.

In total, 24 DFGs are prepared for the benchmark applications.
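
A classification of this kind can be sketched as a simple threshold lookup. The sketch assumes that the FU, input and output values of the S, M and L classes are upper bounds and that everything larger falls into XL; the slides do not state the exact rule, so the limits table and function name are illustrative only.

```python
# (class name, max FUs, max inputs, max outputs), read as upper bounds (assumption).
CLASS_LIMITS = [
    ("RDP-S", 128, 19, 12),
    ("RDP-M", 512, 19, 12),
    ("RDP-L", 1024, 38, 24),
]


def classify_dfg(num_fus, num_inputs, num_outputs):
    """Return the smallest class whose assumed limits accommodate the DFG."""
    for name, max_fus, max_inputs, max_outputs in CLASS_LIMITS:
        if (num_fus <= max_fus
                and num_inputs <= max_inputs
                and num_outputs <= max_outputs):
            return name
    return "RDP-XL"


# The sample Heat DFG shown earlier: 721 operations, 32 inputs, 16 outputs
# (treating each operation as one FU, which is also an assumption).
print(classify_dfg(721, 32, 16))
```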

Mapping DFGs onto LSRDP

[Figure: mapping flow. Inputs: DFG and LSRDP architecture description. Steps shown: longest connections, placing DFG nodes on LSRDP, routing connections, placing IO nodes, routing input/output connections. Output: configuration file.]

LSRDP Design Procedure

[Figure: flowchart. Input: DFGs & LSRDP HW constraints. For each parameter: choosing a design parameter, mapping DFGs onto the LSRDP, analyzing the mapping results, obtaining required statistics, choosing the appropriate value. Output: appropriate values for all parameters.]
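
The per-parameter loop can be sketched as follows. This is not the authors' tool chain: `map_dfg` is a hypothetical mapper callback, the selection rule (smallest candidate that lets every DFG map) is an assumption, and the toy MCL requirements are made-up values used only to exercise the loop.

```python
def choose_parameter_value(candidates, dfgs, map_dfg):
    """Pick the smallest candidate value for which every DFG maps.

    map_dfg(dfg, value) is a hypothetical mapper returning True on success.
    Returns (chosen value or None, statistics per tried value).
    """
    stats = {}
    for value in sorted(candidates):
        # Map all DFGs with this value and record how many succeeded.
        mapped = sum(1 for dfg in dfgs if map_dfg(dfg, value))
        stats[value] = mapped
        if mapped == len(dfgs):
            return value, stats
    return None, stats


# Toy example: each DFG is described only by the MCL it needs (made-up values).
required_mcl = {"heat": 4, "poisson": 5, "vibration": 3}


def toy_mapper(dfg, mcl):
    return required_mcl[dfg] <= mcl


best, stats = choose_parameter_value(range(1, 10), list(required_mcl), toy_mapper)
print(best, stats)
```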


Preliminary Results

LSRDP Specifications: Width & Height

LSRDP dimensions and the number of input/output ports:

           # of Input ports   # of Output ports   Width   Height
LSRDP-S    19                 12                  16      16
LSRDP-M    19                 12                  32      16
LSRDP-L    38                 24                  64      32

LSRDP Specifications: MCL

           MCL (avg/max)   ORN size: no. of inputs (avg/max), outputs
LSRDP-S    4/8             18/34, 3
LSRDP-M    5/9             22/38, 3
LSRDP-L    5/9             22/34, 3

Further MCL optimization is needed.

[Figure: two rows of FU/T PEs connected through an ORN. Example with MCL (Max. Conn. Len.) = 2: No. of inputs = (2 x MCL + 1) x 2 = 10, No. of outputs = 3, i.e. a 10-to-3 ORN. In general, MCL = L.]

Analyzing Various LSRDP Layouts

Benchmark DFGs shown: Vibration, Poisson, ERI1, ERI2, Heat. Sizes per DFG (the row-to-DFG assignment is not recoverable from the slide text):

Layout I   Layout II   Layout III
8x3        8x3         8x4
10x8       10x8        10x11
10x10      10x12       15x18
6x2        9x3         6x2
10x10      10x10       15x8

Layout II can be used instead of Layout I to obtain a smaller LSRDP (except for the ERI1 DFG, which gives a better size with Layout III).

LSRDP at One Glance (1/2)

• Functional units: ADD/SUB, MUL
• Layout: Type II (checker pattern)
• Operations: 64-bit floating point
• Processing structure: Pipelined
• PE structure: FU, T, FU+T, T+T

LSRDP size               Small         Medium        Large
No. of inp/out ports     19/12         19/12         38/24
Width/Height             16/16         32/16         64/32
Conf. bit-stream size
  Imm. regs              16*16*64      32*16*64      64*32*64
  ORNs                   16*BSS(ORN)   32*BSS(ORN)   64*BSS(ORN)
  PEs                    16*16*2       32*16*2       64*32*2
ORN
  Inputs, outputs        22, 3         26, 3         26, 3
  Structure              Cross-bar switch
  Conn. type             One-directional
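
The immediate-register and PE parts of the configuration bit-stream follow directly from the width and height. The sketch below evaluates those two products for the three sizes; the ORN part is left out because BSS(ORN) is not given numerically. The names are illustrative.

```python
IMMEDIATE_BITS = 64  # one 64-bit immediate register per PE
PE_CONF_BITS = 2     # two configuration bits per PE, as listed in the table

SIZES = {"Small": (16, 16), "Medium": (32, 16), "Large": (64, 32)}


def bitstream_bits(width, height):
    """Return the bit counts for immediate registers and PE configuration."""
    pes = width * height
    return {"imm_regs": pes * IMMEDIATE_BITS, "pes": pes * PE_CONF_BITS}


for name, (width, height) in SIZES.items():
    print(name, bitstream_bits(width, height))
```

For the Small configuration this gives 16 * 16 * 64 = 16384 bits of immediate data and 16 * 16 * 2 = 512 PE configuration bits.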

LSRDP at One Glance (2/2)

Internal memory:
• Type: Immediate registers
• Size and count: 64-bit registers, one reg. for each PE
• Communication mechanism: Serial

External memory:
• No. of memory modules: 16
• Data trans. rate: 1800 Mbps/pin
• Overall data trans. rate: 24 GB/s
• Mem. to LSRDP bus width: 64 bit
• Channels per module: Two

Reconf. mechanism: Bit-serial configuration through a serial chain

Preliminary Performance Evaluation

Base processor configuration:
• Processor type: Out-of-order
• GPP operating frequency: 3.2 GHz
• Inst. issue width: 4 instructions/cc
• Inst. decode width: 4 instructions/cc
• Cache configuration:
  • L1 data: 64KB (128B entry, 2-way, 2cc)
  • L1 instruction: 64KB (64B entry, 1-way, 1cc)
  • L2 unified: 4MB (128B entry, 4-way, 16cc)
• Latency of main memory: 300cc
• L2 to main memory: bus width 64 Bytes, freq. 800 MHz

GPP+LSRDP configuration:
• LSRDP operating frequency: 80 GHz
• Reconfiguration latency: 1cc
• Latency: SPM-LSRDP 1cc, Main Memory-SPM 7500cc
• Bandwidth: SPM-LSRDP max. 64 * 8 Bytes/cc, Main Memory-SPM 102.4 GB/sec

Evaluation method:
• GPP: execution time measured by means of a processor simulator
• LSRDP: estimation by performance modeling
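
The cycle counts and bandwidths above can be converted into common units with a few lines of arithmetic. The sketch assumes that the 300cc main-memory latency is counted in GPP cycles (3.2 GHz) and that the 7500cc Main Memory-SPM latency and the SPM-LSRDP bandwidth are counted in LSRDP cycles (80 GHz); the slides do not say which clock each figure refers to.

```python
GPP_HZ = 3.2e9     # GPP operating frequency
LSRDP_HZ = 80e9    # LSRDP operating frequency
L2_BUS_HZ = 800e6  # L2 to main memory bus frequency


def cycles_to_ns(cycles, hz):
    """Convert a cycle count at a given clock frequency to nanoseconds."""
    return cycles / hz * 1e9


main_memory_ns = cycles_to_ns(300, GPP_HZ)        # GPP view of main memory
mem_to_spm_ns = cycles_to_ns(7500, LSRDP_HZ)      # assumed to be LSRDP cycles
l2_bus_gb_per_s = 64 * L2_BUS_HZ / 1e9            # 64-byte bus at 800 MHz
spm_lsrdp_tb_per_s = 64 * 8 * LSRDP_HZ / 1e12     # 64 * 8 bytes per LSRDP cycle

print(main_memory_ns, mem_to_spm_ns, l2_bus_gb_per_s, spm_lsrdp_tb_per_s)
```

Under these assumptions both latencies come out at about 93.75 ns, so the two configurations describe the same main-memory access time in different clock domains.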

Preliminary Performance Evaluation (Heat)

[Figure: stacked bar chart of execution time normalized by GPP execution time (0 to 0.4) for Heat (M) and Heat (L), each in the Basic and Reuse configurations. Components: Reconf., Comm., Rearrange, Stall, LSRDP, GPP.]

• Basic: SB only
• Reuse: SB + SPM

Data reuse is employed to avoid the need for data rearrangement as well as frequent data retrieval from the scratchpad memory.

Preliminary Performance Evaluation (Poisson)

[Figure: stacked bar chart of execution time normalized by GPP execution time (0 to 0.4) for poisson(S), poisson(M) and poisson(L). Components: Reconf., Comm., Rearrange, Stall, LSRDP, GPP.]

Only a small fraction of the time is processing time on the LSRDP; the main fraction consists of various overhead times as well as the execution time on the GPP.


Conclusions & Future Work

A high-performance computer comprising an accelerator (LSRDP) implemented by superconducting circuits was introduced.

24 benchmark Data Flow Graphs (DFGs) were manually generated.

The LSRDP micro-architecture was designed based on the characteristics of scientific applications via a quantitative approach.

LSRDP is promising for resolving issues originating from CMOS technology as well as for achieving considerable performance.

Future Work:

• To achieve higher performance, the various overhead costs, mainly related to data management, must be reduced.

• To reduce the implementation cost of LSRDP, we will focus on reducing the maximum connection length and the ORN size.

Acknowledgement

This research was supported in part by Core Research for Evolutional Science and Technology (CREST) of Japan Science and Technology Corporation (JST).

2x3 RDP processor prototype

• 8-bit ALUs implementing: ADD, SUB, AND, OR, XOR
• 25 GHz frequency
• 6-bit data transfer shift registers
• 16-bit I/O shift registers
• 21 pipeline stages
• 7-bit data width
• Area: 6.84 x 6.72 mm^2
• Total number of junctions: 14040 JJs
• Bias current: 1.652 A

[Figure: chip layout with ALU1 to ALU6, ORN1 to ORN6, SR_IN, SR_OUT and the ALU controller.]

Source: A. Fujimaki, et al., "Demonstration of an SFQ-Based Accelerator Prototype for a High-Performance Computer," ASC08, 2008.

