
Future Directions for On-Chip Interconnection Networks


OCIN: 1 Dec 7, 2006

Future Directions for On-Chip Interconnection Networks

William J. Dally

Computer Systems Laboratory

Stanford University

OCIN Workshop

December 7, 2006

OCIN: 2 Dec 7, 2006

State of Off-Chip Networks

OCIN: 3 Dec 7, 2006

[Figure: bandwidth per router node (Gb/s), log scale from 0.1 to 10000, versus year, 1985 to 2010. Systems plotted: Torus Routing Chip, Intel iPSC/2, J-Machine, CM-5, Intel Paragon XP, Cray T3D, MIT Alewife, IBM Vulcan, Cray T3E, SGI Origin 2000, AlphaServer GS320, IBM SP Switch2, Quadrics QsNet, Cray X1, Velio 3003, IBM HPS, SGI Altix 3000, Cray XT3, YARC, BlackWidow]

Technology Trends…

OCIN: 4 Dec 7, 2006

Some History

MARS Router, 1984

Torus Routing Chip, 1985

Network Design Frame, 1988

MDP, 1991

Reliable Router, 1994

MAP, 1998

Imagine, 2002

YARC, 2006

Robert Mullins

OCIN: 5 Dec 7, 2006

Some very good books

OCIN: 6 Dec 7, 2006

Summary of Off-Chip Networks*

• Topology

– Fit to packaging and signaling technology

– High-radix - Clos or FlatBfly gives lowest cost

• Routing

– Global adaptive routing balances load w/o destroying locality

• Flow control

– Virtual channels/virtual cut-through

*oversimplified

OCIN: 7 Dec 7, 2006

So, what’s different about on-chip networks?

OCIN: 8 Dec 7, 2006

What’s different about OCINs?

• Cost

– Off: cost is channels - pins, connectors, cables, optics

– On: cost is storage and switches, wires plentiful

– Drives networks with many wide channels, few buffers

• Low-radix augmented mesh

• Channel Characteristics

– On-chip RC lines - need a repeater every 1 mm

– Short distance - low latency

– Can put logic in repeaters, motivates low-latency routers

• Workload

– CMP cache traffic

– SoC isochronous flows

• These differences motivate some surprising departures from off-chip networks

OCIN: 9 Dec 7, 2006

Example CMP OCIN

OCIN: 10 Dec 7, 2006

On-Chip Interconnection Network

System = Processor Tiles

Source: Balfour and Dally, ICS 06

OCIN: 11 Dec 7, 2006

On-Chip Interconnection Network (2)

System = Processor Tiles + Channels

Source: Balfour and Dally, ICS 06

OCIN: 12 Dec 7, 2006

On-Chip Interconnection Network (3)

System = Processor Tiles + Channels + Routers

Source: Balfour and Dally, ICS 06

OCIN: 13 Dec 7, 2006

Router Architecture

• Input-queued

• Virtual Channel

• Speculative Pipeline

Source: Balfour and Dally, ICS 06

OCIN: 14 Dec 7, 2006

Router Area

Accurate modeling requires floorplan

Source: Balfour and Dally, ICS 06

OCIN: 15 Dec 7, 2006

Architecture very sensitive to element properties

OCIN: 16 Dec 7, 2006

Enabling Technology is a Prerequisite

Channels,

Buffers,

Switches

Topology

Routing

Flow Control

Microarchitecture

OCIN: 17 Dec 7, 2006

Channels

• 10x to 100x power reduction

• Equalized signaling for faster propagation and increased repeater distance (D & P Chapter 8, Heaton 01)

• Elastic channels provide “free” buffers (Mizuno 01)

• Send 4-8 bits per cycle per wire (assuming a 20 FO4 cycle)
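
As a rough illustration of the last bullet, the raw throughput of a channel is the number of wires times the bits each wire carries per cycle. The sketch below is only arithmetic under stated assumptions: the 128-wire channel and the function names are hypothetical, and only the 4-8 bits per cycle per wire range comes from the slide.

```python
import math


def channel_bits_per_cycle(num_wires: int, bits_per_wire_per_cycle: int) -> int:
    """Raw bits delivered per router cycle by one channel."""
    return num_wires * bits_per_wire_per_cycle


def wires_needed(target_bits_per_cycle: int, bits_per_wire_per_cycle: int) -> int:
    """Wires required to reach a target throughput, rounded up."""
    return math.ceil(target_bits_per_cycle / bits_per_wire_per_cycle)


if __name__ == "__main__":
    # Hypothetical 128-wire channel; 1 bit per wire per cycle is the
    # conventional baseline, 4 and 8 are the range quoted on the slide.
    for rate in (1, 4, 8):
        print(rate, channel_bits_per_cycle(128, rate))
    # Equivalently, a fixed 512 bits per cycle needs fewer wires at higher rates.
    for rate in (1, 4, 8):
        print(rate, wires_needed(512, rate))
```

Read the other way around, a faster per-wire signaling rate lets the same throughput fit in fewer wires, which is one way such channels can save area.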

OCIN: 18 Dec 7, 2006

Buffers

• Dense arrays (vs. flip-flops or latches): 10x less area per bit

• Low-swing write

OCIN: 19 Dec 7, 2006

Switches

• Low-swing bit lines

• Operate at channel rate

• Reduces area and hence power

• Equalized drive

• Buffered crosspoints

• Integral allocation

OCIN: 20 Dec 7, 2006

Properties of these elements drive optimal network organization

OCIN: 21 Dec 7, 2006

Torus

Source: Balfour and Dally, ICS 06

OCIN: 22 Dec 7, 2006

Concentrated Mesh

Source: Balfour and Dally, ICS 06
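
One way to compare the torus and concentrated-mesh organizations is average hop count. The sketch below is a brute-force calculation under assumptions that are not taken from the slides: 64 tiles, minimal routing, uniform traffic over all source/destination router pairs (including a router sending to itself), and a concentration of 4 tiles per router. The function name is hypothetical.

```python
from itertools import product


def average_hops(k: int, wraparound: bool = False) -> float:
    """Average minimal hop count over all ordered router pairs in a k x k
    mesh (wraparound=False) or torus (wraparound=True)."""
    routers = list(product(range(k), repeat=2))

    def dist(a: int, b: int) -> int:
        d = abs(a - b)
        # A torus can also go the short way around the ring.
        return min(d, k - d) if wraparound else d

    total = sum(
        dist(sx, dx) + dist(sy, dy)
        for (sx, sy) in routers
        for (dx, dy) in routers
    )
    return total / len(routers) ** 2


if __name__ == "__main__":
    # 64 tiles with one router per tile: 8 x 8 routers.
    print("mesh 8x8:", average_hops(8))
    print("torus 8x8:", average_hops(8, wraparound=True))
    # 64 tiles with 4 tiles sharing each router: 4 x 4 routers.
    print("concentrated mesh 4x4:", average_hops(4))
```

Hop count is only one input. The rest of the talk argues that the better topology also depends on the area and energy of channels, buffers and switches, which this sketch ignores.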

OCIN: 23 Dec 7, 2006

Express Links

Source: Balfour and Dally, ICS 06

OCIN: 24 Dec 7, 2006

Network Replication

• Abundant wire resources build second network

– Resource allocation tradeoff

Wide:

[+] Serialization Latency

[+] Router Energy Efficiency

[-] Router Area

Replicated:

[+] Decoupled Resources

[+] Area Efficiency

[?] Energy Efficiency

[-] Serialization Latency

[+] SCALABLE

Source: Balfour and Dally, ICS 06
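
The serialization-latency side of this tradeoff can be sketched with simple arithmetic. The example assumes, which the slide does not state, that the same wire budget is spent either on one wide network or on two networks of half the width. The 512-bit packet, the 256-wire budget and the function name are hypothetical.

```python
import math


def serialization_cycles(packet_bits: int, channel_width_bits: int) -> int:
    """Cycles needed to push one packet through a channel of the given width."""
    return math.ceil(packet_bits / channel_width_bits)


if __name__ == "__main__":
    packet_bits = 512   # hypothetical packet size
    wire_budget = 256   # hypothetical total channel width in bits

    wide = serialization_cycles(packet_bits, wire_budget)
    replicated = serialization_cycles(packet_bits, wire_budget // 2)

    print("wide network:", wide, "cycles per packet")
    print("replicated network:", replicated, "cycles per packet")
```

Under these assumptions each packet takes twice as long to serialize on a replicated network, while the two networks can carry different packets at the same time, so the aggregate channel bandwidth is unchanged.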

OCIN: 25 Dec 7, 2006

Energy Efficiency

[Figure: Network Energy and Completion Time (normalized to Torus network)]

Source: Balfour and Dally, ICS 06

OCIN: 26 Dec 7, 2006

Large differences in efficiency.

Optimal topology not obvious, not regular, and very sensitive to properties of network elements

OCIN: 27 Dec 7, 2006

Where is Energy Expended?

Source: Balfour and Dally, ICS 06

OCIN: 28 Dec 7, 2006

A Research Agenda

1. Develop efficient network elements

• Channels, buffers, switches, allocators

• Opportunities for >10x improvements in efficiency

• Enabling technology

2. Capture workloads representative of CMPs and SoCs

3. Develop optimal topologies for 1 and 2

4. Develop efficient routing and flow-control methods

• Load-balanced routing

• Buffer-efficient flow control

5. Develop efficient router microarchitectures

• Single cycle, area efficient

6. Prototype to test assumptions

7. Iterate

OCIN: 29 Dec 7, 2006

Summary

• OCINs critically important

– Emerging CMPs, SoCs

• Very different from off-chip networks

– Cost - largely area

– Channel properties

• OCIN design very sensitive to implementation

– Need floorplans, accurate estimates

• Efficient network elements are enabling technology

– Change the equation for network design

• Optimal design

– Concentration, replicated networks

• Many research opportunities

