Fujitsu processor history and future
Takumi Maruyama, Senior Director, AI Platform Division, Advanced System Research & Development Unit
ADAC Japan 2017, Jan 25th, 2017
Page 1

Connect. Challenge. Inspire.

All Rights Reserved, Copyright© FUJITSU LIMITED 2015

ADAC Japan 2017

Jan 25th, 2017

Fujitsu processor history and future

Takumi Maruyama Senior Director AI Platform Division Advanced System Research & Development Unit

Page 2

Agenda

• K computer
• Fujitsu processor development history
  • HPC
  • UNIX/Mainframe
• Future development plan
  • Post K
  • AI processor: DLU
• Summary

1 All Rights Reserved, Copyright 2017 FUJITSU LIMITED

Page 3

RIKEN K computer

World ranking #1 on the TOP500 list: 10.51 PFlops (2011/11)

Page 4

K computer “Still the best”

[Table: K computer rankings and awards by year, 2011 to 2016, in four categories. The per-year entries did not survive extraction.]

1. TOP500 List
2. Gordon Bell Prize
3. HPC Challenge Awards (HPC, Random Access, STREAM, FFT)
4. Graph500

Page 5

K computer “Highly reliable”

The CPU failure rate of the K computer is about a quarter of that of Blue Waters.

(Source: ISC 2015, Long term failure analysis of 10 petascale supercomputer, RIKEN)

Page 6

Fujitsu technologies to realize K computer

• High Performance Processor: 8 cores
• Liquid Cooling
• 4 processors
• Torus network: 6D
• High density rack: 24 boards
• 864 racks
• 82,944 Compute nodes
• 5,184 IO nodes

“K” (京) is the nickname of the “Next-Generation Supercomputer” that RIKEN has been using since July 2010.

©RIKEN
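The rack and node counts above can be cross-checked with a little arithmetic. The sketch below assumes that "24board" means 24 compute boards per rack and that "4processor" means 4 processors per board, with one processor per compute node; these readings and all names in the code are assumptions, not statements from the slide.

```python
# Cross-check of the K computer figures quoted on the slide.
# Assumption: 24 compute boards per rack, 4 processors (one per node) per board.
RACKS = 864
BOARDS_PER_RACK = 24
NODES_PER_BOARD = 4
COMPUTE_NODES = 82_944   # quoted on the slide
IO_NODES = 5_184         # quoted on the slide


def compute_nodes(racks: int, boards_per_rack: int, nodes_per_board: int) -> int:
    """Number of compute nodes implied by the assumed rack layout."""
    return racks * boards_per_rack * nodes_per_board


derived = compute_nodes(RACKS, BOARDS_PER_RACK, NODES_PER_BOARD)
io_per_rack = IO_NODES / RACKS

print(derived, derived == COMPUTE_NODES)  # the assumed layout reproduces the slide's count
print(io_per_rack)                        # IO nodes per rack implied by the slide
```

Under these assumptions the layout reproduces the quoted 82,944 compute nodes, and the IO node count divides evenly across the racks.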

Page 7

Fujitsu processor development History - HPC - UNIX/GS

Page 8

Fujitsu Processor development

Perpetual evolution, Always targeting No.1

Mainframe/UNIX/HPC + AI incremental development

[Figure: processor roadmap chart with time periods 2000~2003, 2004~2007, 2008~2011, 2012~2015 and 2016~. Positions within the chart did not survive extraction; the labels are listed below.]

• Mainframe: GS8600, GS8800, GS8800B, GS8900, GS21 600, GS21 900, GS21 1600, GS21 2600, Next GS
• UNIX: SPARC64, SPARC64 II, SPARC64 GP, SPARC64 V, SPARC64 V+, SPARC64 VI, SPARC64 VII, SPARC64 X, SPARC64 X+, SPARC64 XII
• HPC: SPARC64 VIIIfx, SPARC64 IXfx, SPARC64 XIfx, Post-K (ARM)
• AI: DLU
• Technology generation: 350nm, 250nm / 220nm, 180nm, 130nm, 90nm, 65nm, 45nm, 40nm, 28nm, 20nm
• Performance: Store Ahead, Branch History, Prefetch, Single-chip CPU, Non-Blocking $, O-O-O Execution, Super-Scalar, L2$ on Die, Multi-core Multi-thread, HPC-ACE, System on Chip, Hardware Barrier, Virtual Machine Architecture, Software on Chip, High-speed Interconnect
• Reliability: $ ECC, Register/ALU Parity, Instruction Retry, $ Dynamic Degradation, Error Checkers/History
• Other chart annotation: Tr=1B CMOS Cu

Page 9

SPARC64™ VII

• Architecture Features
  • 4 cores x 2 threads (SMT)
  • Embedded 6MB L2$
  • 2.5GHz
  • Jupiter Bus
• Fujitsu 65nm CMOS
  • 21.31mm x 20.86mm
  • 600M transistors
  • 456 signal pins
• 135 W (max)
  • 44% power reduction per core from SPARC64™ VI

[Figure: die floor plan. Labels: Core 1, Core 2, Core 3, System I/O, System Interface, L2 Cache Control, L2 Cache Tag, L2 Cache Data, Instruction Control, Execution Unit, L1 Cache Control, L1D Cache, L1I Cache]

Page 10

SPARC64™ VII in HPC & UNIX server

[Figure: SPARC64™ VII die floor plan. Labels: Core 1, Core 2, Core 3, System I/O, System Interface, L2 Cache Control, L2 Cache Tag, L2 Cache Data, Instruction Control, Execution Unit, L1 Cache Control, L1D Cache, L1I Cache]

[Figure: system board diagram, Supercomputer FX1. Labels: SPARC64™ VII, JSC, DDR2; link widths 32B, 32B, 20B(in)+12B(out)]

[Figure: system board diagram, SPARC Enterprise. Labels: CPU (x4), MC (x4), SC (x4), DDR2 (x4)]

The same processor was used in both HPC and UNIX servers; only the SC (system controller LSI) was different.

Page 11

SPARC64™ VIIIfx Chip [HPC]

• Architecture Features
  • 8 cores
  • HPC-ACE (128-bit SIMD)
  • Shared 6 MB L2$
  • Embedded Memory Controller
  • 2 GHz
• Fujitsu 45nm CMOS
  • 22.7mm x 22.6mm
  • 760M transistors
  • 1271 signal pins
• Performance (peak)
  • 128GFlops
  • 64GB/s memory throughput
• Power
  • 58W (TYP, 30℃)
  • Water Cooling: low leakage power and high reliability
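The 128 GFlops peak figure follows from the core count, clock and SIMD width listed above once a flops-per-cycle value is chosen. The sketch below assumes two SIMD fused multiply-add (FMA) pipes per core and double-precision operands; the slide does not state these, and the function name `peak_gflops` is hypothetical.

```python
def peak_gflops(cores: int, ghz: float, simd_bits: int,
                fma_pipes: int = 2, word_bits: int = 64) -> float:
    """Peak GFlops = cores x GHz x flops per cycle per core.

    Assumption (not from the slide): each core has `fma_pipes` SIMD FMA
    pipes, and each FMA counts as two floating-point operations.
    """
    lanes = simd_bits // word_bits             # operands per SIMD register
    flops_per_cycle = lanes * fma_pipes * 2    # multiply + add per lane per pipe
    return cores * ghz * flops_per_cycle


# SPARC64 VIIIfx figures from the slide: 8 cores, 2 GHz, 128-bit SIMD.
viiifx = peak_gflops(cores=8, ghz=2.0, simd_bits=128)
print(viiifx)  # 128.0
```

With those assumptions each core retires 8 flops per cycle, which reproduces the slide's 128 GFlops.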

[Figure: die floor plan. Labels: Core0 to Core7, L2$ Data, L2$ Control, DDR3 interface, MAC, HSIO]

Page 12

SPARC64™ X+ Chip [UNIX]

• Architecture Features
  • 16 cores x 2 SMT threads
  • Shared 24 MB L2$
  • Memory and I/O Controllers
  • HPC-ACE (128bit SIMD)
  • SWoC (Software on Chip)
• 28nm CMOS
  • 24.0mm x 25.0mm
  • 2,990M transistors
  • 1,500 signal pins
  • 3.7GHz
• Performance (peak)
  • 473GFlops
  • 102GB/s memory throughput

[Figure: die floor plan. Labels: 16 Cores, L2 Cache Data, L2 Cache Control, DDR3 Interface, MAC, SERDES / Inter-CPU, SERDES / PCIe Gen3]

Page 13

SPARC64™ XIfx Chip [HPC]

• Architecture Features
  • 32 computing cores + 2 assistant cores
  • HPC-ACE2 (256bit SIMD)
  • 24 MB L2 cache
  • HMC, Tofu2, PCI Gen3
• 20nm CMOS
  • 3,750M transistors
  • 1,001 signal pins
  • 2.2GHz
• Performance (peak)
  • 1.1TFlops
  • HMC 240GB/s x 2 (in/out)
  • Tofu2 125GB/s x 2 (in/out)
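One way to compare the three chip slides is the ratio of memory bandwidth to peak compute (bytes per flop). The sketch below uses only figures quoted on the SPARC64 VIIIfx, X+ and XIfx slides; counting a single HMC direction (240 GB/s) for XIfx, counting only the 32 computing cores, and assuming two FMA pipes per core are choices made for this illustration, not statements from the slides.

```python
# (peak GFlops, memory GB/s) as quoted on the slides.
# Assumption: XIfx uses the one-direction HMC figure of 240 GB/s.
CHIPS = {
    "SPARC64 VIIIfx": (128.0, 64.0),
    "SPARC64 X+": (473.0, 102.0),
    "SPARC64 XIfx": (1100.0, 240.0),
}


def bytes_per_flop(gflops: float, gb_per_s: float) -> float:
    """Memory bytes available per peak floating-point operation."""
    return gb_per_s / gflops


for name, (gflops, bandwidth) in CHIPS.items():
    print(f"{name}: {bytes_per_flop(gflops, bandwidth):.2f} B/F")

# Peak check for XIfx: 32 computing cores, 2.2 GHz, 256-bit SIMD.
# Assumption: 2 FMA pipes per core, 2 flops per FMA, 64-bit operands.
lanes = 256 // 64
xifx_peak = 32 * 2.2 * (lanes * 2 * 2)
print(f"{xifx_peak / 1000:.2f} TFlops")
```

Under these assumptions the derived XIfx peak rounds to the slide's 1.1 TFlops.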

[Figure: die floor plan. Labels: 32 cores, 2 Assistant cores, L2 cache, HMC interface, MAC, Tofu2 interface, Tofu2 controller, PCI interface, PCI controller]

Page 14


HPC-ACE @SPARC64™ VIIIfx (High Performance Computing - Arithmetic Computational Extensions)

Fujitsu’s unique ISA extension to SPARC-V9 for HPC

• Large register sets

• 128bit SIMD

• Software controlled Cache

• FP Trigonometric Functions

• Conditional operation

• FP Reciprocal Approximation of Divide/Square-root

The SPARC architecture is tolerant of ISA enhancements.
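The "FP Reciprocal Approximation of Divide/Square-root" item refers to a common pattern: hardware supplies a low-precision estimate and software refines it with a few multiply-add steps. The slides do not give the HPC-ACE instruction semantics, so the sketch below only illustrates the general Newton-Raphson refinement; the function names and starting estimates are hypothetical.

```python
def refine_reciprocal(a: float, x0: float, steps: int = 3) -> float:
    """Newton-Raphson for 1/a: x <- x * (2 - a * x)."""
    x = x0
    for _ in range(steps):
        x = x * (2.0 - a * x)
    return x


def refine_rsqrt(a: float, y0: float, steps: int = 3) -> float:
    """Newton-Raphson for 1/sqrt(a): y <- y * (1.5 - 0.5 * a * y * y)."""
    y = y0
    for _ in range(steps):
        y = y * (1.5 - 0.5 * a * y * y)
    return y


a = 3.0
recip = refine_reciprocal(a, 0.3)   # 0.3 stands in for a coarse estimate of 1/3
rsqrt = refine_rsqrt(a, 0.5)        # 0.5 stands in for a coarse estimate of 1/sqrt(3)
print(recip, rsqrt)
```

Each step of the reciprocal iteration squares the relative error, so a coarse estimate reaches useful precision in a few multiply-add steps, which suits FMA-heavy SIMD pipelines.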

[Figure: Register extension to SPARC-V9. INT Reg (with Register Window) and FP Reg are each split into a V9 part and an Ext. part; the FP registers are marked SIMD (basic) and SIMD (extended). Register counts shown in the figure: 32, 224, 32, 32, 160, 32.]

Page 15


Software on Chip @SPARC64™ X+

HW for SW
• ISA extensions to accelerate specific software functions with HW

The targets
• Decimal operation (IEEE754 decimal and NUMBER)
• Cypher operation (AES/DES)
• Database acceleration

HW implementation
• The HW engines for SWoC are implemented in the FPU
  • To fully utilize 128 FP registers & software pipelining
• Implemented as instructions rather than a dedicated co-processor, to maximize flexibility of SW
• Avoid complication due to “CISC” type instructions
  • Various “RISC” type instructions are newly defined instead
  • 18 insts. for Decimal, and 10 insts. for Cypher operation
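As background on why decimal operation is an acceleration target, the sketch below contrasts binary floating point with decimal arithmetic using Python's standard `decimal` module. It is a software illustration only and does not represent the SWoC instructions, which the slides do not describe; the 16-digit precision is chosen to match the IEEE 754 decimal64 format.

```python
from decimal import Decimal, getcontext

# 16 significant digits, matching the precision of IEEE 754 decimal64.
getcontext().prec = 16

binary_sum = 0.1 + 0.2                          # binary floating point
decimal_sum = Decimal("0.1") + Decimal("0.2")   # decimal floating point

print(binary_sum == 0.3)              # False: 0.1 and 0.2 are not exact in binary
print(decimal_sum == Decimal("0.3"))  # True: decimal digits are represented exactly
```

Doing this in software costs many integer instructions per operation, which is the kind of overhead dedicated decimal instructions are meant to remove.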

Page 16

SPARC64™ VII Pipeline [HPC/UNIX]

[Figure: pipeline block diagram, 4-core. Stage row: Fetch, Issue, Dispatch, Reg.-Read, Execute, Memory, Commit. Block labels, grouped by function:]

• Instruction fetch: L1 I$ 64KB 2Way, Branch Target Address 8Kentry
• Decode & Issue
• Reservation stations: RSE 8x2Entry, RSA 10Entry, RSF 8x2Entry, RSBR 10Entry
• Registers: GPR 156Registers x2, GUB 32Registers, FPR 64Registers x2, FUB 48Registers
• Execution units: EXA, EXB, EAGA, EAGB, FLA, FLB
• Load/store and caches: Fetch Port 16Entry, Store Port 16Entry, Store Buffer 16Entry, L1 D$ 64KB 2Way, L2$ 6MB/12MB 12Way, System Bus Interface
• Commit: CSE 64Entry, PC x2, Control Registers x2

Page 17

SPARC64™ VIIIfx Pipeline [HPC]

[Figure: pipeline block diagram, 8-core. Stage row: Fetch, Issue, Dispatch, Reg.-Read, Execute, Memory, Commit. Block labels, grouped by function:]

• Instruction fetch: L1 I$ 32KB 2Way, Branch Target Address 1Kentry 2Way
• Decode & Issue
• Reservation stations: RSE 10Entry, RSA 10Entry, RSF 8x2Entry, RSBR 6Entry
• Registers: GPR 188Registers, GUB 32Registers, FPR 256Registers, FUB 48x2Registers
• Execution units: EXA, EXB, EAGA, EAGB, FLA, FLB, FLC, FLD
• Load/store and caches: Fetch Port 20Entry, Store Port 8Entry, Write Buffer 5Entry, L1 D$ 32KB 2Way, L2$ 6MB 12Way, Memory Controller, DIMM
• Commit: CSE 48Entry, PC, Control Registers

Page 18

SPARC64™ X Pipeline [UNIX]

[Figure: pipeline block diagram, 16-core. Stage row: Fetch, Issue, Dispatch, Reg.-Read, Execute, Memory, Commit. Block labels, grouped by function:]

• Instruction fetch: L1 I$ 64KB 4Way, Branch Target Address 4Kentry, Pattern History Table 16Kentry
• Decode & Issue
• Reservation stations: RSE 24Entry, RSA 24Entry, RSF 20Entry, RSBR 16Entry
• Registers: GPR 156Registers (shown twice), GUB 64Registers, FPR 128Registers (shown twice), FUB 64Registers
• Execution units: EXA, EXB, EXC, EXD, EAGA, EAGB, FLA, FLB, FLC, FLD, Decimal, Cypher
• Load/store and caches: Fetch Port 32Entry, Store Port 24Entry, Write Buffer 10Entry, L1 D$ 64KB 4Way, L2$ 24MB 24Way, Memory Controller, DIMM
• Commit: CSE 96Entry, PC (shown twice), Control Registers (shown twice)
• Chip-level blocks: IO Controller, Router, PCI-GEN3, CPU-CPU I/F

Page 19

SPARC64™ XIfx Pipeline [HPC]

[Figure: pipeline block diagram, 34 cores. Stage row: Fetch, Issue, Dispatch, Reg-Read, Execute, Cache and Memory, Commit. Most entry counts are not given on this slide. Block labels, grouped by function:]

• Instruction fetch: L1 I$ 64KB 4ways, Branch Target Address, Pattern History Table, Local Pattern Table
• Decode & Issue
• Reservation stations: RSE, RSA, RSF, RSBR
• Registers: GPR 188Registers, GUB, FPR 128x4 Reg., FUB
• Execution units: EXA, EXB, EXC, EXD, EAGA, EAGB, FLA, FLB (the FLB label appears several times in the extracted text)
• Load/store and caches: Fetch Port, Store Port, Write Buffer, L1 D$ 64KB 4Way, L2$, MAC (HMC Controller), HMC
• Commit: CSE, PC, Control Registers
• Chip-level blocks: PCI Controller, Tofu 2 controller, PCI-GEN3, CPU-CPU I/F

Page 20

Fujitsu’s processor design approach

Fully utilize the latest semiconductor technology

Enhance/change ISA (Instruction Set Architecture) to meet requirements: HPC-ACE, SWoC

Shared micro-Architecture across HPC/UNIX/GS

Perpetual evolution over generations

Page 21

Future Fujitsu processor development - Post K - AI Processor (DLU™)

Page 22

Post-K Goals and Approaches

Post-K Goals

High application performance and good power efficiency

Keeping application compatibility while advancing from predecessors

Good usability and better accessibility for users

Our Approaches

Developing high performance and scalable, custom CPU cores

[Performance] Wider SIMD & high memory BW, mathematical acceleration primitives

[Scalability] Scalable many core, zero OS jitter (assistant core)

[Power efficiency] The best device tech, power control functions, optimal resources

Maintaining performance balance and supporting advanced features

• High memory BW, “Tofu” interconnect, and RIKEN advanced system software

Adopting ARM standard architecture

• Co-operation with ARM/Linux community and utilization of open source software

• Getting involved in the ARM HPC ecosystem

Page 23

Post-K CPU ISA: ARM SVE

FUJITSU, as a lead partner in ARM SVE development, contributes to the specification of ARM SVE (Scalable Vector Extension) to improve application performance

FUJITSU ARM core incorporates FUJITSU’s proven supercomputer microarchitecture

ARM SVE, plus optional functions and Tofu, maintain programming models and performance balance

Post-K complies with ARM’s standard frameworks (SBSA, etc.), for compatibility among platforms

Functions for Perf.      Post-K        FX100    FX10     K computer
SVE incorporated
  SIMD                   512bit        256bit   128bit   128bit
  FMA4                   ✔             ✔        ✔        ✔
  Math. acc. prim.*      ✔ Enhanced    ✔        ✔        ✔
Optional functions
  Inter-core barrier     ✔             ✔        ✔        ✔
  Sector cache           ✔ Enhanced    ✔        ✔        ✔
  Prefetch modes         ✔ Enhanced    ✔        ✔        ✔
Interconnect: Tofu       ✔ Enhanced    ✔        ✔        ✔

*Mathematical acceleration primitives include trigonometric functions, sine & cosines, and exponential...
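SVE is vector-length agnostic: the same loop can run on hardware with different SIMD widths, with a predicate masking off the lanes past the end of the data. The sketch below models that idea in plain Python for the 128-, 256- and 512-bit widths listed in the table; it is not SVE code, and the function name `vla_axpy` is hypothetical.

```python
def vla_axpy(a, x, y, vector_bits):
    """Compute a*x + y in vector-sized chunks, with a predicate for the tail."""
    lanes = vector_bits // 64              # 64-bit elements per vector
    out = list(y)
    for base in range(0, len(x), lanes):
        # Predicate: lane i is active only while base + i is inside the array.
        predicate = [base + i < len(x) for i in range(lanes)]
        for i, active in enumerate(predicate):
            if active:
                out[base + i] = a * x[base + i] + out[base + i]
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0] * 5
results = {bits: vla_axpy(2.0, x, y, bits) for bits in (128, 256, 512)}
print(results)
```

All three widths give the same result without a separate scalar clean-up loop, which is how one binary can stay compatible across implementations with different vector lengths.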

Page 24

DLU™ Goals

DLU™ (Deep Learning Unit), FY2018~

Supercomputer K technologies

DLU™ features
• Architecture designed for Deep Learning
• High performance HBM2 memory
• Low power design
➔ Goal: 10x Performance/Watt compared to others

• Massively parallel: Apply the supercomputer interconnect technology
➔ Ability to handle large scale neural networks

(The photo is an illustrative image and differs from the actual product.)

Page 25

DLU™ architecture

ISA: Newly developed for Deep learning

Micro-Architecture
• Simple pipeline to remove HW complexity
• On chip network to share data between DPUs

Utilize Fujitsu’s HPC experience such as high density FMA and high speed interconnect
➔ Maximize performance (throughput) / watt

[Figure: DLU™ (Deep Learning Unit) block diagram. Labels: Host I/F, DPU-0, DPU-1, ..., DPU-n, DPE (many), HBM2, on-chip network. Annotations: “New ISA for Deep learning”, “High density FMA”, “Fujitsu’s interconnect technology”, “Large scale DLU interconnect through off-chip network”.]

DPU: Deep learning Processing Unit, DPE: Deep learning Processing Element

Page 26

DLU™ Future Roadmap

Multiple generations of DLUs over time, as we currently do for HPC/UNIX/Mainframe processors.

Fujitsu
• K computer processor development technology
• Architecture optimized for DL
• Large scale network

• The 1st Generation: Accelerator; needs separate Host CPU
• The 2nd Generation: Host CPU embedded; Inter-DLU direct connection; Large scale neural network
• Future: Neuro computing; Combinatorial optimization architecture

[Figure: roadmap chart. Labels: Performance per watt ×10; timeline labels FY2018 and FY2021]

* Subject to change without notice

FUJITSU CONFIDENTIAL

Page 27

New Arch. Rivals Quantum Computers

New Architecture
• Architecture designed for combinatorial optimization problems
• Uses a basic optimization circuit, based on digital circuitry and conventional semiconductor technology
• Hierarchical structure for optimal data movement and parallel calculation

Acceleration Technology
• Calculates multiple candidates at once, parallel execution
• Detects and escapes from the local minimum states by adding score

[Figures: “New architecture”; “Acceleration using basic optimization circuit”]

http://www.fujitsu.com/global/about/resources/news/press-releases/2016/1020-02.html
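The two acceleration ideas on this slide (evaluating multiple candidates at once, and escaping local minima by adding a score) can be illustrated with a small software annealer. This is a minimal sketch loosely modelled on the slide's wording, not Fujitsu's circuit: the QUBO formulation, the temperature schedule, the offset rule and all names are assumptions made for the example.

```python
import math
import random


def energy(bits, q):
    """QUBO energy: sum over i <= j of q[i][j] * bits[i] * bits[j]."""
    n = len(bits)
    return sum(q[i][j] * bits[i] * bits[j] for i in range(n) for j in range(i, n))


def anneal(q, steps=2000, t_start=2.0, t_end=0.05, offset_step=0.1, seed=0):
    rng = random.Random(seed)
    n = len(q)
    bits = [rng.randint(0, 1) for _ in range(n)]
    cur = energy(bits, q)
    best, best_e = bits[:], cur
    offset = 0.0
    for step in range(steps):
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Evaluate every single-bit flip as a candidate (done in a loop here;
        # the slide describes evaluating candidates in parallel).
        accepted = []
        for i in range(n):
            trial = bits[:]
            trial[i] ^= 1
            delta = energy(trial, q) - cur
            barrier = delta - offset   # the offset lowers the barrier when stuck
            if barrier <= 0 or rng.random() < math.exp(-barrier / t):
                accepted.append((i, delta))
        if accepted:
            i, delta = rng.choice(accepted)
            bits[i] ^= 1
            cur += delta
            offset = 0.0
            if cur < best_e:
                best, best_e = bits[:], cur
        else:
            offset += offset_step      # no candidate accepted: add to the score
    return best, best_e


# Tiny example: minimise -x0 - x1 - x2 + 2*x0*x1 + 2*x1*x2 over bits.
Q = [[-1, 2, 0],
     [0, -1, 2],
     [0, 0, -1]]
solution, value = anneal(Q)
print(solution, value)
```

For this three-variable problem the unique minimum is x = (1, 0, 1) with energy -2, which can be confirmed by enumerating all eight assignments.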

Page 28

Summary

Page 29

FJ Processors

[Figure: processor timeline by segment. Chart positions did not survive extraction; labels are grouped below.]

• HPC: FX1, 4 cores (2008, SPARC64™ VII, 65nm); K Computer, 8 cores (2010, SPARC64™ VIIIfx, 45nm); FX10, 16 cores (2012, SPARC64™ IXfx, 40nm); FX100, 34 cores (2015, SPARC64™ XIfx, 20nm); Post “京” (Post-K)
• UNIX: SPARC Enterprise, 4 cores (2011, SPARC64™ VII+, 65nm); M10, 16 cores (2013, SPARC64™ X, 28nm); SPARC64™ XII
• Mainframe: GS21 model 2600, 8 cores (2014); Next GS
• AI: DLU
• An additional 28nm label in the chart could not be matched to a product.

Utilization of the latest Semiconductor technologies

Evolution to meet different requirements between HPC/UNIX/Mainframe/AI

Page 30

An example of different requirements: Single thread performance vs Throughput

[Figure: scatter plot of GFlops (log scale, 10 to 10,000) against clock frequency in GHz (0 to 4). The GHz axis is labelled “Single thread performance [Amdahl’s law]” and the GFlops axis “Throughput”. Group labels: UNIX processor, HPC processor, AI processor. Point labels: SPARC64™ VII, SPARC64™ X+, SPARC64™ VIIIfx, SPARC64™ IXfx, SPARC64™ XIfx, DLU.]
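The reference to Amdahl’s law explains why single thread performance still matters when core counts grow: the serial part of a workload limits the speedup that extra cores can deliver. A minimal sketch, using core counts that appear in this deck (4, 16, 32) and parallel fractions chosen purely for illustration:

```python
def amdahl_speedup(parallel_fraction: float, n: int) -> float:
    """Speedup on n cores when only parallel_fraction of the work scales."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n)


# Parallel fractions below are illustrative, not taken from the slides.
for p in (0.5, 0.95, 0.999):
    speedups = [round(amdahl_speedup(p, n), 1) for n in (4, 16, 32)]
    print(p, speedups)
```

A workload that is only half parallel gains little beyond a few cores, so it benefits more from a high clock rate, while a workload that is almost fully parallel keeps scaling and favours throughput-oriented designs.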

Page 31

Summary

• Fujitsu has successfully designed various processors for decades.
• Fujitsu’s processor design wins have come from:
  • Instruction set architecture (ISA) enhancements
  • Shared micro-architecture with perpetual evolution over generations
  • Semiconductor technology improvements
• Fujitsu will take a similar design approach for Post-K and DLU.
  • ISA and micro-architecture are getting more important due to the limitation of Moore’s law.
• Fujitsu will continue to develop processors to meet the needs of a new era.