CS252 Graduate Computer Architecture Lecture 21 Multiprocessor Networks (con’t)

Date post: 30-Jan-2016
  • CS252 Graduate Computer Architecture, Lecture 21

    Multiprocessor Networks (cont.)
    John Kubiatowicz
    Electrical Engineering and Computer Sciences
    University of California, Berkeley

    http://www.eecs.berkeley.edu/~kubitron/cs252

    cs252-S09, Lecture 21

  • Review: On Chip: Embeddings in two dimensions

    Embed multiple logical dimensions in one physical dimension using long wires.
    When embedding a higher dimension in a lower one, either some wires are longer than others, or all wires are long.
    Example shown: 6 x 3 x 2


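  • Review: Store&Forward vs Cut-Through Routing

    Time: $h\,(n/b + D/f)$ vs $n/b + h\,D/f$
    OR (cycles): $h\,(n/w + D)$ vs $n/w + h\,D$
    (h = hops, n = message size, b = channel bandwidth, w = channel width, D = routing delay per hop in cycles, f = clock rate, so that b = w f)

    What if the message is fragmented?
    Wormhole vs virtual cut-through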
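The cycle-count comparison above can be sketched in a few lines of Python. This is a minimal illustration rather than anything from the lecture: the function names and the sample values (a 320-bit message, 8-bit channels, 15 hops, 2-cycle routing delay) are assumptions chosen only to show how the hop count multiplies the whole message time in one case and only the routing delay in the other.

```python
def store_and_forward_cycles(n, w, h, delay):
    """Every hop receives the full message (n/w cycles) and then routes it."""
    return h * (n / w + delay)


def cut_through_cycles(n, w, h, delay):
    """Only the header pays the routing delay per hop; the body pipelines behind it."""
    return n / w + h * delay


if __name__ == "__main__":
    # Illustrative values (assumed, not from the slides).
    n, w, h, delay = 320, 8, 15, 2
    print(store_and_forward_cycles(n, w, h, delay))  # 630.0
    print(cut_through_cycles(n, w, h, delay))        # 70.0
```

With a single hop the two expressions coincide; the gap grows linearly with h.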


  • Contention

    Two packets trying to use the same link at the same time
      limited buffering
      drop?
    Most parallel machine networks block in place
      link-level flow control
      tree saturation
    Closed system: offered load depends on delivered
      source squelching


  • Bandwidth

    What affects local bandwidth?
      packet density: $b \times n_{data}/n$
      routing delay: $b \times n_{data}/(n + wD)$
      contention
        at the endpoints
        within the network
    Aggregate bandwidth
      bisection bandwidth: sum of the bandwidth of the smallest set of links that partition the network
      total bandwidth of all the channels: $Cb$
      suppose $N$ hosts each issue a packet every $M$ cycles with average distance $h$
        each message occupies $h$ channels for $l = n/w$ cycles each
        $C/N$ channels available per node
        link utilization for store-and-forward: $\rho = (hl/M \text{ channel cycles/node})/(C/N) = Nhl/(MC) < 1$!
        link utilization for wormhole routing?
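The store-and-forward utilization bound on this slide can be checked numerically. The sketch below is a hedged illustration: the function name, argument names, and the sample network (1024 hosts, 2048 channels, 16-hop average distance, 320-bit messages on 8-bit channels, one packet per 1000 cycles) are all assumptions made for the example, not values from the lecture.

```python
def link_utilization(num_hosts, hops, msg_bits, width_bits,
                     interval_cycles, num_channels):
    """Utilization rho = N*h*l / (M*C), with l = n/w cycles per channel."""
    l = msg_bits / width_bits                    # cycles a message holds one channel
    demand = hops * l / interval_cycles          # channel-cycles needed per node per cycle
    supply = num_channels / num_hosts            # channels available per node
    return demand / supply


if __name__ == "__main__":
    # Illustrative values (assumed): the network is usable only while rho < 1.
    rho = link_utilization(num_hosts=1024, hops=16, msg_bits=320,
                           width_bits=8, interval_cycles=1000,
                           num_channels=2048)
    print(f"rho = {rho:.2f}")
```

Halving the injection interval M doubles the utilization, so the same network saturates once hosts issue packets more often than every N h l / C cycles.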


  • Saturation

    [Figure: two saturation charts. Chart 1: Latency vs. Delivered Bandwidth. Chart 2: Delivered Bandwidth vs. Offered Bandwidth. The tabulated sample values follow latency = 1 + 2p/(0.7 - p) for offered load p below 0.7, and delivered bandwidth = min(p, 0.7).]
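The sample values that accompany the saturation charts are consistent with a simple curve in which latency grows without bound as the offered load approaches a saturation point, while delivered bandwidth tracks offered bandwidth until it flattens. The Python sketch below reproduces that shape. The saturation point 0.7 and the factor 2 are read off the tabulated values; the function names are hypothetical, and the lecture itself does not state this model explicitly.

```python
def delivered_bandwidth(offered, saturation=0.7):
    """Delivered load equals offered load until the network saturates."""
    return min(offered, saturation)


def latency(offered, saturation=0.7, k=2.0):
    """Latency 1 + k*p/(a - p); unbounded at or beyond the saturation point a."""
    if offered >= saturation:
        return float("inf")
    return 1.0 + k * offered / (saturation - offered)


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.68, 0.9):
        print(f"offered={p:.2f}  delivered={delivered_bandwidth(p):.2f}  "
              f"latency={latency(p):.2f}")
```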

  • How Many Dimensions?

    n = 2 or n = 3 (n here counts dimensions, written d below)
      Short wires, easy to build
      Many hops, low bisection bandwidth
      Requires traffic locality
    n >= 4
      Harder to build, more wires, longer average length
      Fewer hops, better bisection bandwidth
      Can handle non-local traffic

    k-ary d-cubes provide a consistent framework for comparison
      $N = k^d$
      scale dimension ($d$) or nodes per dimension ($k$)
      assume cut-through


  • Traditional Scaling: Latency scaling with N

    Assumes equal channel width
      independent of node count or dimension
      dominated by average distance

    [Figure: two charts of average latency vs. Machine Size (N), one for T(n=40) and one for T(n=140), each with curves for d=2, d=3, d=4, k=2, and the n/w baseline. Tabulated values for n/w = 40: at N = 1024 the latency is 71 (d=2), 53.6 (d=3), 49.3 (d=4), 45 (k=2); at N = 1048576 it is 1063 (d=2), 190.9 (d=3), 102 (d=4), 50 (k=2).]
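The latency-scaling curves can be regenerated from the k-ary d-cube relations used in this lecture, N = k^d and average distance d(k-1)/2, under cut-through routing. The sketch below is an illustration under stated assumptions: a routing delay of one cycle per hop (which matches the tabulated value of 43 for N = 16, d = 2, n/w = 40), fractional k allowed when N is not a perfect d-th power, and hypothetical function names.

```python
import math


def average_distance(num_nodes, dims):
    """Average hop count d*(k-1)/2 for a k-ary d-cube with N = k**d nodes."""
    k = num_nodes ** (1.0 / dims)   # nodes per dimension, may be fractional
    return dims * (k - 1) / 2


def average_latency(num_nodes, dims, n_over_w=40, routing_delay=1):
    """Cut-through latency in cycles: n/w + h*D."""
    return n_over_w + average_distance(num_nodes, dims) * routing_delay


if __name__ == "__main__":
    for n_nodes in (16, 1024, 1048576):
        row = [average_latency(n_nodes, d) for d in (2, 3, 4)]
        # k = 2 is the hypercube: the dimension grows as log2(N).
        row.append(average_latency(n_nodes, math.log2(n_nodes)))
        print(n_nodes, [round(t, 1) for t in row])
```

With equal channel width, higher dimension always looks better here, which is the assumption the lecture goes on to question.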

  • Average Distance

    but, equal channel width is not equal cost!
    Higher dimension => more channels
    ave dist = $d\,(k-1)/2$

    [Figure: Ave Distance vs. Dimension for N = 256, 1024, 16384, and 1048576; for N = 1024 the tabulated distance falls from 31 at d = 2 to 5 at d = 10.]

    [Chart data embedded after this slide, by sheet name. The first five charts plot one curve each for 256, 1024, 16 k, and 1M nodes.
      fig7-14-equal-node: Latency with fixed channel width. Average Latency (n = 40, D = 2) vs. Dimension.
      fig-10-15-equal-pin: Latency with fixed pin count (64 out pins, L = 320, R = 2). Ave Latency T(n=40B) vs. Dimension (d).
      fig-10-15 (2): Latency with fixed pin count (64 out pins, L = 1120, R = 2). Ave Latency T(n=140 B) vs. Dimension (d).
      fig-10-16-wire: Latency with fixed bisection (w = k/2, L = 640, R = 2). Ave Latency T(n=40) vs. Dimension (d).
      fig-10-17: Latency with fixed pin count (64 out pins, L = 1120, R = 20). Ave Latency T(n=140 B) vs. Dimension (d).
      fig-10-18&19: Saturation with load. Latency vs. Ave Channel Utilization for n/w = 4, 8, 16, 40, comparing d = 3, k = 10 (N = 1000) with d = 2, k = 32 (N = 1024), R = 2, with latency computed as n + h(r + q*n).]

    &A

    Page &P

    Sheet7

    &A

    Page &P

    n/w=40

    n/w=16

    n/w=8

    n/w = 4

    Ave Channel Utilization

    Latency

    Sheet8

    Latency for given offered load

    1032

    rhoflitsn8, d3, k10Flitsn8, d2, k32

    0.050.011111111136.30994152050.003225806571.181663837

    0.10.022222222237.76543209880.006451612972.4946236559

    0.150.033333333339.39215686270.009677419473.9620493359

    0.20.044444444441.22222222220.012903225875.6129032258

    0.250.055555555643.29629629630.016129032377.4838709677

    0.30.066666666745.66666666670.019354838779.6221198157

    0.350.077777777848.40170940170.022580645282.0893300248

    0.40.088888888951.59259259260.025806451684.9677419355

    0.450.155.36363636360.029032258188.3695014663

    0.50.111111111159.88888888890.032258064592.4516129032

    0.550.122222222265.41975308640.03548387197.4408602151

    0.60.133333333372.33333333330.0387096774103.6774193548

    0.650.144444444481.22222222220.0419354839111.6958525346

    0.70.155555555693.07407407410.0451612903122.3870967742

    0.750.1666666667109.66666666670.0483870968137.3548387097

    0.80.1777777778134.55555555560.0516129032159.8064516129

    0.850.1888888889176.0370370370.0548387097197.2258064516

    0.90.22590.0580645161272.064516129

    0.950.2111111111507.88888888890.0612903226496.5806451613

    0.960.2133333333632.33333333330.0619354839608.8387096774

    0.970.2155555556839.74074074070.0625806452795.935483871

    &A

    Page &P

    Sheet8

    36.309941520571.181663837

    37.765432098872.4946236559

    39.392156862773.9620493359

    41.222222222275.6129032258

    43.296296296377.4838709677

    45.666666666779.6221198157

    48.401709401782.0893300248

    51.592592592684.9677419355

    55.363636363688.3695014663

    59.888888888992.4516129032

    65.419753086497.4408602151

    72.3333333333103.6774193548

    81.2222222222111.6958525346

    93.0740740741122.3870967742

    109.6666666667137.3548387097

    134.5555555556159.8064516129

    176.037037037197.2258064516

    259272.064516129

    507.8888888889496.5806451613

    632.3333333333608.8387096774

    839.7407407407795.935483871

    &A

    Page &P

    n8, d3, k10

    n8, d2, k32

    Flits per cycle per processor

    Latency

    Sheet9

    &A

    Page &P

    Sheet10

    &A

    Page &P

    Sheet11

    &A

    Page &P

    Sheet12

    &A

    Page &P

    Sheet13

    &A

    Page &P

    Sheet14

    &A

    Page &P

    Sheet15

    &A

    Page &P

    Sheet16

    &A

    Page &P

    &A

    Page &P

  • In the 3D world
    For n nodes, bisection area is O(n^(2/3))
    For large n, bisection bandwidth is limited to O(n^(2/3))
      Bill Dally, IEEE TPDS, [Dal90a]
    For fixed bisection bandwidth, low-dimensional k-ary n-cubes are better (otherwise higher is better)
      i.e., a few short fat wires are better than many long thin wires
    What about many long fat wires?


  • Dally paper (cont.)
    Equal bisection: W = 1 for the hypercube, W = k/2 for a k-ary n-cube
    Three wire models:
      Constant delay, independent of length
      Logarithmic delay with length (exponential driver tree)
      Linear delay (speed of light/optimal repeaters)
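The equal-bisection normalization can be checked numerically. The sketch below assumes that 2Nw/k wires cross the middle of a k-ary cube with N nodes and channel width w, and takes a binary hypercube (k = 2) with W = 1 as the reference. The function names are illustrative and do not come from the paper.

```python
def bisection_wires(num_nodes: int, k: int, w: float) -> float:
    """Wires crossing the middle of a k-ary cube: 2 * N * w / k."""
    return 2.0 * num_nodes * w / k


def equal_bisection_width(k: int, hypercube_width: float = 1.0) -> float:
    """Channel width that gives a radix-k cube the same bisection as a
    hypercube (k = 2) with the same node count and the given width."""
    return hypercube_width * k / 2.0


if __name__ == "__main__":
    n_nodes = 1024
    reference = bisection_wires(n_nodes, 2, 1.0)  # hypercube, W = 1
    for k in (2, 4, 32):  # d = 10, 5, 2 for 1024 nodes
        w = equal_bisection_width(k)
        print(k, w, bisection_wires(n_nodes, k, w) == reference)
```

Under this normalization a lower-dimensional network (larger k) gets proportionally wider channels, which is what makes the comparison favor low dimensions.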


  • Equal cost in k-ary n-cubes
    Equal number of nodes?
    Equal number of pins/wires?
    Equal bisection bandwidth?
    Equal area?
    Equal wire length?

    What do we know?
      switch degree: d
      diameter = d(k-1)
      total links = Nd
      pins per node = 2wd
      bisection = k^(d-1) = N/k links in each direction
      2Nw/k wires cross the middle
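The quantities listed above can be tabulated for a concrete network. This is a minimal sketch that takes the radix k, dimension d, and channel width w as inputs and applies the slide's formulas directly; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeProperties:
    nodes: int            # N = k^d
    diameter: int         # d(k-1)
    total_links: int      # Nd
    pins_per_node: int    # 2wd
    bisection_links: int  # k^(d-1) = N/k links in each direction
    bisection_wires: int  # 2Nw/k wires cross the middle


def cube_properties(k: int, d: int, w: int) -> CubeProperties:
    """Apply the slide's cost formulas to a k-ary d-cube with channel width w."""
    n_nodes = k ** d
    return CubeProperties(
        nodes=n_nodes,
        diameter=d * (k - 1),
        total_links=n_nodes * d,
        pins_per_node=2 * w * d,
        bisection_links=k ** (d - 1),
        bisection_wires=2 * n_nodes * w // k,
    )


if __name__ == "__main__":
    # 32-ary 2-cube with 32-bit channels: 1024 nodes, 128 pins per node.
    print(cube_properties(k=32, d=2, w=32))
```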


  • Latency for Equal Width Channels
    total links(N) = Nd
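The equal-width comparison can be sketched with the unloaded cut-through latency n/w + hD from the review slide. The code assumes k = N^(1/d), an average of (k-1)/2 hops per dimension, n/w = 40 cycles, and a per-hop delay D = 2. These parameter values and the function names are assumptions chosen for illustration.

```python
def average_hops(num_nodes: int, d: int) -> float:
    """Average distance h = d * (k - 1) / 2, with radix k = N^(1/d)."""
    k = num_nodes ** (1.0 / d)
    return d * (k - 1.0) / 2.0


def latency_equal_width(num_nodes: int, d: int,
                        n_over_w: float = 40.0,
                        hop_delay: float = 2.0) -> float:
    """Unloaded cut-through latency in cycles: n/w + h * D."""
    return n_over_w + average_hops(num_nodes, d) * hop_delay


if __name__ == "__main__":
    for d in (2, 4, 8):
        print(d, round(latency_equal_width(256, d), 2))
```

With the channel width held fixed, only the hop term changes with d, so latency falls monotonically as the dimension grows.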


    [Figure: average latency (n = 40, D = 2) vs. dimension d, one curve each for N = 256, 1024, 16384, and 1048576 nodes]

  • Latency with Equal Pin Count
    Baseline d = 2 has w = 32 (128 wires per node)
    Fix 2dw pins => w(d) = 64/d
    Distance goes up with d, but channel time goes down
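The trade-off on this slide can be sketched by holding 2dw at 128 wires per node, so that w(d) = 64/d, and evaluating the unloaded latency. The code assumes a 40-byte (320-bit) message, a per-hop delay D = 2, and an average of (k-1)/2 hops per dimension with k = N^(1/d); these values and the function names are illustrative assumptions.

```python
def channel_width(d: int, wires_per_node: float = 128.0) -> float:
    """Channel width under a fixed pin budget: 2 * d * w = wires_per_node."""
    return wires_per_node / (2.0 * d)


def latency_equal_pins(num_nodes: int, d: int,
                       message_bits: float = 320.0,
                       hop_delay: float = 2.0,
                       wires_per_node: float = 128.0) -> float:
    """Unloaded latency in cycles: message_bits / w(d) + h * D."""
    k = num_nodes ** (1.0 / d)          # radix of the k-ary d-cube
    hops = d * (k - 1.0) / 2.0          # average distance
    return message_bits / channel_width(d, wires_per_node) + hops * hop_delay


if __name__ == "__main__":
    for d in (2, 3, 4, 6, 8):
        print(d, channel_width(d), round(latency_equal_pins(256, d), 2))
```

Unlike the equal-width case, the latency now has a minimum at a moderate dimension: beyond it, the narrower channels cost more serialization time than the shorter distance saves.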


    Chart3

    40722642056

    31.048812623642.238105197588.1952504945316.7810019779

    3238.62741699861.2548339959144

    35.15716566514054.8220225318100

    39.119052598743.048812623654.238105197584.476210395

    43.457253191446.84260269695678.7205251988

    4851.0273138458.908685288177.2548339959

    955.441075300162.45522086177.9950454247

    106066.390158215580

    111170.577967776982.7900152198

    121274.939089159486.0976252472

    131379.423917988589.762994259

    14148493.6852053937

    15151597.7976314968

    161616102.0546276801

    171717106.4239366366

    181818110.8821506001

    191919115.411891689

    202020120

    256 nodes

    1024 nodes

    16 k nodes

    1M nodes

    Dimension (d)

    Ave Latency T(n=40B)

    Sheet1

    Latency with Fixed

    Hops per dimAve Hops

    Ndkkd = (k-1)/2h = d*kdT(n) = n/w + hr

    2562167.515

    25636.34960420792.67480210398.0244063118

    256441.56

    25653.0314331331.01571656655.0785828326

    25662.51984209980.75992104994.5595262994

    25672.20817902730.60408951374.2286265957

    256820.54

    Ndkkd = (k-1)/2h = d*kd

    102423215.531

    1024310.07936839924.539684199613.6190525987

    102445.65685424952.32842712479.313708499

    1024541.57.5

    102463.17480210391.0874010526.5244063118

    102472.69180038530.84590019265.9213013484

    102482.378414230.6892071155.51365692

    102492.16011947780.58005973895.22053765

    10241020.55

    Ndkkd = (k-1)/2h = d*kd

    16384212863.5127

    16384325.398416831512.199208415736.5976252472

    16384411.3137084995.156854249520.627416998

    1638456.96440450642.982202253214.9110112659

    1638465.03968419962.019842099812.1190525987

    16384741.510.5

    1638483.3635856611.18179283059.4543426441

    1638492.93946898460.96973449238.7276104305

    16384102.63901582150.81950791088.1950791077

    16384112.41617888880.70808944447.7889838884

    16384122.24492409660.62246204837.4695445797

    16384132.1095321530.55476607657.2119589943

    163841420.57

    Ndkkd = (k-1)/2h = d*kd

    104857621024511.51023

    10485763101.59366732650.296833663150.8905009889

    104857643215.562

    10485765167.537.5

    1048576610.07936839924.539684199627.2381051975

    104857677.24578931413.122894657121.8602625994

    104857685.65685424952.328427124718.627416998

    104857694.66611615831.833058079216.4975227124

    10485761041.515

    1048576113.526365021.2631825113.8950076099

    1048576123.17480210391.08740105213.0488126236

    1048576132.90484571220.952422856112.3814971295

    1048576142.69180038530.845900192611.8426026969

    1048576152.51984209980.759921049911.3988157484

    1048576162.378414230.68920711511.02731384

    1048576172.26023156690.630115783410.7119683183

    1048576182.16011947780.580059738910.4410753001

    1048576192.07431008890.537155044410.2059458445

    10485762020.510

    Routing Delay for fixed R

    &A

    Page &P

    ave dist

    Routing Delay for Fixed R

    N

    d2561024163841048576

    215311271023

    38.024406311813.619052598736.5976252472150.8905009889

    469.31370849920.62741699862

    55.07858283267.514.911011265937.5

    64.55952629946.524406311812.119052598727.2381051975

    74.22862659575.921301348410.521.8602625994

    845.513656929.454342644118.627416998

    95.220537658.727610430516.4975227124

    1058.195079107715

    117.788983888413.8950076099

    127.469544579713.0488126236

    137.211958994312.3814971295

    14711.8426026969

    1511.3988157484

    1611.02731384

    1710.7119683183

    1810.4410753001

    1910.2059458445

    2010

    &A

    Page &P

    ave dist

    &A

    Page &P

    256

    1024

    16384

    1048576

    Dimension

    Ave Distance

    fig7-14-equal-node

    Latency with fixed channel width

    L40320 b140

    R22

    NN

    d2561024163841048576d2561024163841048576

    270102294208621702023942186

    356.048812623667.2381051975113.1952504945341.78100197793156.0488126236167.2381051975213.1952504945441.7810019779

    45258.62741699881.25483399591644152158.627416998181.2548339959264

    550.15716566515569.82202253181155150.1571656651155169.8220225318215

    649.119052598753.048812623664.238105197594.4762103956149.1190525987153.0488126236164.2381051975194.476210395

    748.457253191451.84260269696183.72052519887148.4572531914151.8426026969161183.7205251988

    84851.0273138458.908685288177.25483399598148151.02731384158.9086852881177.2548339959

    950.441075300157.45522086172.99504542479150.4410753001157.455220861172.9950454247

    105056.39015821557010150156.3901582155170

    1155.577967776967.790015219811155.5779677769167.7900152198

    1254.939089159466.097625247212154.9390891594166.0976252472

    1354.423917988564.76299425913154.4239179885164.762994259

    145463.685205393714154163.6852053937

    1562.797631496815162.7976314968

    1662.054627680116162.0546276801

    1761.423936636617161.4239366366

    1860.882150600118160.8821506001

    1960.41189168919160.411891689

    206020160

    &A

    Page &P

    fig7-14-equal-node

    &A

    Page &P

    256

    1024

    16384

    1048576

    Dimension

    Average Latency (n = 40, D = 2)

    [Chart data, sheet fig-10-15-equal-pin: latency with fixed pin count. Parameters: L = 320, R = 2, 64 out pins, channel width w = 64/d in the tabulated values (32 at d = 2, 3.2 at d = 20). Series: 256, 1024, 16 k and 1M nodes. x-axis: dimension (d), 2 to 20; y-axis: average latency T(n = 40 B). At d = 2 the latency is 40, 72, 264 and 2056 cycles. The minimum falls at d = 3 (256 nodes, about 31.0), d = 4 (1024 nodes, about 38.6), d = 6 (16 k nodes, about 54.2) and d = 8 (1M nodes, about 77.3).]
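The fixed-pin-count numbers can be reproduced with a short Python sketch. It assumes what the tabulated values imply rather than what any header states outright: w = pins/d, k = N^(1/d), average hops h = d(k-1)/2, and T = L/w + h·R. The names `latency_fixed_pins` and `best_dimension` are made up for illustration.

```python
def latency_fixed_pins(N, d, L=320, R=2, pins=64):
    """Unloaded latency when each node has a fixed number of output pins.

    Assumed model (inferred from the sheet's values): the pins are split
    evenly over d dimensions, so the channel width is w = pins / d.
    """
    w = pins / d                 # channel width shrinks as dimension grows
    k = N ** (1.0 / d)           # nodes per dimension (may be non-integer)
    h = d * (k - 1) / 2          # average hop count
    return L / w + h * R         # serialization time + routing delay


def best_dimension(N, max_d=20, **kwargs):
    """Dimension with the lowest latency, restricted to k >= 2 as in the sheet."""
    dims = [d for d in range(2, max_d + 1) if N ** (1.0 / d) >= 2 - 1e-9]
    return min(dims, key=lambda d: latency_fixed_pins(N, d, **kwargs))


if __name__ == "__main__":
    for N in (256, 1024, 16384, 1048576):
        d = best_dimension(N)
        print(N, d, round(latency_fixed_pins(N, d), 2))
```

With the default L = 320 and R = 2 the best dimension grows slowly with machine size, which is the point of the chart: narrowing the channels makes very high dimensions expensive once serialization time dominates.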

    [Chart data, sheet fig-10-15 (2): latency with fixed pin count. Parameters: L = 1120, R = 2, 64 out pins, channel width w from 32 at d = 2 down to 3.2 at d = 20. Series: 256, 1024, 16 k and 1M nodes. x-axis: dimension (d), 2 to 20; y-axis: average latency T(n = 140 B). At d = 2 the latency is 65, 97, 289 and 2081 cycles. The minimum falls at d = 2 (256 nodes, 65), d = 3 (1024 nodes, about 79.7), d = 4 (16 k nodes, about 111.3) and d = 6 (1M nodes, about 159.5).]

    [Chart data, sheet fig-10-16-wire: latency with fixed bisection. Parameters: L = 640, R = 2, channel width w = k/2 (the sheet also lists 64 out pins). Series: 256, 1024, 16 k and 1M nodes. x-axis: dimension (d), 2 to 20; y-axis: average latency T(n = 40). At d = 2 the latency is 110, 102, 264 and 2047.25 cycles with w = 8, 16, 64 and 512. The minimum falls at d = 2 (256 and 1024 nodes), d = 3 (16 k nodes, about 123.6) and d = 5 (1M nodes, 155); at d = 20 the 1M-node series reaches 660 cycles with w = 1.]
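A minimal Python sketch of the fixed-bisection model follows. It uses the sheet's w = k/2 scaling and the same hop-count assumptions as the other sheets (k = N^(1/d), h = d(k-1)/2, T = L/w + h·R); the function name `latency_fixed_bisection` is hypothetical.

```python
def latency_fixed_bisection(N, d, L=640, R=2):
    """Unloaded latency when total bisection width is held constant.

    Assumed model (from the sheet's 'w = k/2' column): higher-dimensional
    networks have more channels crossing the bisection, so each channel
    must be narrower.
    """
    k = N ** (1.0 / d)       # nodes per dimension
    w = k / 2                # channel width under the bisection constraint
    h = d * (k - 1) / 2      # average hop count
    return L / w + h * R


if __name__ == "__main__":
    # Low dimensions win for the 1M-node machine once wires are the constraint.
    for d in (2, 3, 4, 5, 6, 20):
        print(d, round(latency_fixed_bisection(1048576, d), 2))
```

Compared with the fixed-pin-count case, the optimum shifts toward low dimensions because the channel width now shrinks with k rather than with d.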

    [Chart data, sheet fig-10-17: latency with fixed pin count and a larger routing delay. Parameters: L = 1120, R = 20, 64 out pins, channel width w from 32 at d = 2 down to 3.2 at d = 20. Series: 256, 1024, 16 k and 1M nodes. x-axis: dimension (d), 2 to 20; y-axis: average latency T(n = 140 B). At d = 2 the latency is 335, 655, 2575 and 20495 cycles. The minimum falls at d = 5 (256 nodes, about 189.1), d = 6 (1024 nodes, about 235.5), d = 8 (16 k nodes, about 329.1) and d = 11 (1M nodes, about 470.4).]

    [Chart data, sheet fig-10-18&19: saturation with load. Two networks with R = 2: N = 1000 (k = 10, d = 3; k_d = 4.5, average hops h = 13.5, wait constant 0.2304526749, 0.2222 flits per cycle per node) and N = 1024 (k = 32, d = 2; 0.0645 per cycle per node). Message lengths n/w = 4, 8, 16, 40 give base latencies of 31, 35, 43, 67 cycles (k = 10, d = 3) and 66, 70, 78, 102 cycles (k = 32, d = 2). Loaded latency is tabulated as n + h(r + q*n) for channel utilization rho = 0.05 to 0.99, where the wait factor q is rho/(1 - rho) times the wait constant. For example, at rho = 0.5 and n/w = 40 the latency is about 191.4 cycles (k = 10, d = 3) and 214.3 cycles (k = 32, d = 2). Chart series: n40,d3,k10; n16,d3,k10; n8,d3,k10; n4,d3,k10. x-axis: average channel utilization; y-axis: latency.]
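The loaded-latency columns of the saturation sheet (fig-10-18&19) follow the header formula n + h(r + q*n). The sketch below reproduces them under one labeled assumption: the sheet gives its `wait` constant only as a number (0.2304526749 for k = 10, d = 3 and 0.0905306972 for k = 32, d = 2), and the expression (k_d - 1)/k_d² · (1 + 1/d) used here is an inference that happens to match both values, not something the source states. Function names are hypothetical.

```python
def contended_latency(rho, n_over_w, k, d, R=2):
    """Latency n + h*(r + q*n) at channel utilization rho (0 <= rho < 1)."""
    kd = (k - 1) / 2                          # average hops per dimension
    h = d * kd                                # average total hops
    # ASSUMPTION: inferred form of the sheet's 'wait' constant.
    wait = (kd - 1) / kd ** 2 * (1 + 1 / d)
    q = rho / (1 - rho) * wait                # wait factor, blows up as rho -> 1
    return n_over_w + h * (R + q * n_over_w)


def flits_per_cycle_per_node(rho, k):
    """Offered load matching the sheet's values: rho / k_d."""
    return rho / ((k - 1) / 2)


if __name__ == "__main__":
    for rho in (0.05, 0.5, 0.9, 0.99):
        print(rho,
              round(contended_latency(rho, 40, 10, 3), 1),
              round(contended_latency(rho, 40, 32, 2), 1))
```

At rho = 0 the function returns the base latencies listed in the sheet (for example 31 cycles for n/w = 4, k = 10, d = 3), and the rho/(1 - rho) term is what produces the steep rise near saturation.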

    [Chart, sheet Sheet6 (2): latency versus channel utilization for both networks. Series: n40,d2,k32; n40,d3,k10; n16,d2,k32; n16,d3,k10; n8,d2,k32; n8,d3,k10; n4,d2,k32; n4,d3,k10. x-axis: channel utilization; y-axis: latency.]

    [Chart data, sheet Sheet7: saturation with load for N = 1024, k = 32, d = 2, R = 2 (k_d = 15.5, average hops h = 31, wait constant 0.0905306972). Latency n + h(r + q*n) is tabulated for n/w = 4, 8, 16, 40 and rho = 0.05 to 0.99, rising from base latencies of 66, 70, 78, 102 cycles to about 1177.4, 2292.7, 4523.4, 11215.5 cycles at rho = 0.99. Chart series: n/w = 40, 16, 8, 4. x-axis: average channel utilization; y-axis: latency.]

    [Chart data, sheet Sheet8: latency for a given offered load, n/w = 8. Series: n8, d3, k10 and n8, d2, k32, for rho = 0.05 to 0.97. The offered load, equal to rho/k_d in the tabulated values, runs from 0.0111 to 0.2156 flits per cycle per processor for k = 10, d = 3 and from 0.0032 to 0.0626 for k = 32, d = 2. Latency rises from about 36.3 to 839.7 cycles and from about 71.2 to 795.9 cycles respectively. x-axis: flits per cycle per processor; y-axis: latency.]


    [Chart, Chart4: average latency T(n = 140 B) versus dimension (d) for 256, 1024, 16 k and 1M nodes. The plotted values are those of sheet fig-10-15 (2): 65, 97, 289 and 2081 cycles at d = 2.]

    [Table, sheet Sheet1 ("Latency with Fixed"): hop counts for N nodes arranged in d dimensions with k = N^(1/d) nodes per dimension. Columns: N, d, k, hops per dimension k_d = (k - 1)/2, average hops h = d*k_d, and latency T(n) = n/w + h*r. Rows cover N = 256 (d = 2 to 8), 1024 (d = 2 to 10), 16384 (d = 2 to 14) and 1048576 (d = 2 to 20). Example rows: N = 256, d = 2 gives k = 16, k_d = 7.5, h = 15; N = 1048576, d = 20 gives k = 2, k_d = 0.5, h = 10.]
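The column formulas of Sheet1 can be checked with a few lines of Python. This is a minimal sketch, not the spreadsheet's own logic: the function name `hop_stats` and its argument names are made up for illustration, and k = N^(1/d) is inferred from the tabulated values (256 nodes at d = 2 gives k = 16).

```python
def hop_stats(N, d, n_over_w, r):
    """Return (k, k_d, h, T) for N nodes in d dimensions.

    k   = N**(1/d)       nodes per dimension (inferred from the table)
    k_d = (k - 1) / 2    average hops per dimension
    h   = d * k_d        average total hops
    T   = n/w + h * r    unloaded latency for a message of n/w flits
    """
    k = N ** (1.0 / d)
    kd = (k - 1) / 2
    h = d * kd
    return k, kd, h, n_over_w + h * r


if __name__ == "__main__":
    # Average distance shrinks quickly with dimension, then flattens out.
    for d in (2, 3, 4, 8):
        k, kd, h, T = hop_stats(256, d, n_over_w=40, r=2)
        print(d, round(k, 3), round(kd, 3), round(h, 3), round(T, 3))
```

With n/w = 40 and r = 2 the d = 2 row for 256 nodes gives T = 40 + 15·2 = 70 cycles, which matches the fixed-channel-width chart data.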

    [Chart data, sheet "ave dist": routing delay for fixed R. Average distance versus dimension d = 2 to 20 for N = 256, 1024, 16384 and 1048576. At d = 2 the average distance is 15, 31, 127 and 1023 hops; each series ends where k = 2, at 4 hops (d = 8), 5 hops (d = 10), 7 hops (d = 14) and 10 hops (d = 20) respectively. x-axis: dimension; y-axis: average distance.]
