Mainframe Day 2017 Next Generation Memory for BS2000 as well · Title: BS2000 and PMEM Author:...

Post on 07-Jun-2020

0 FUJITSU INTERNAL Copyright 2017 FUJITSU

Mainframe Day 2017 Next Generation Memory – for BS2000 as well

Dieter.Kasper@ts.fujitsu.com

Fujitsu Distinguished Engineer

CTO Enterprise Platform Services

2017-01-25 v4

PMEM – Next Generation Memory

= Optane PCIe/NVMe

Checking the Marketing numbers

Module      Intel latency factor ~   Latency     Intel size factor ~   Capacity    Atomic granularity
SRAM                 1               2-3 ns              1             ~ 60 MB     64 B
DRAM                10               20-35 ns          100             ~ 64 GB     64 B
3D-XPoint          100               ~ 250 ns        1,000             ~ 512 GB    64 B
NAND           100,000               ~ 80 µs         1,000             ~ 16 TB     4096 B
HDD         10,000,000               ~ 5 ms         10,000             ~ 6 TB      512 / 4096 B
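The check named in the slide title can be reproduced with a few lines of arithmetic. The sketch below divides each latency from the table by the SRAM latency and sets the result next to the quoted Intel factor. Reducing each range to its midpoint (2.5 ns for SRAM, 27.5 ns for DRAM) is an assumption of this sketch, and the names `tier`, `derived_factor` and `print_factor_check` are illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the latency table. Latencies are in nanoseconds; ranges are
 * reduced to their midpoint (an assumption of this sketch). */
struct tier {
    const char *name;
    double latency_ns;     /* "Latency" column */
    double quoted_factor;  /* "Intel latency factor" column */
};

static const struct tier tiers[] = {
    { "SRAM",            2.5,        1.0 },
    { "DRAM",           27.5,       10.0 },
    { "3D-XPoint",     250.0,      100.0 },
    { "NAND",        80000.0,   100000.0 },
    { "HDD",       5000000.0, 10000000.0 },
};
enum { TIER_COUNT = sizeof tiers / sizeof tiers[0] };

/* Latency of tier i relative to SRAM, derived from the listed latencies. */
double derived_factor(int i)
{
    assert(i >= 0 && i < TIER_COUNT);
    return tiers[i].latency_ns / tiers[0].latency_ns;
}

/* Print the quoted and the derived factor side by side for every tier. */
void print_factor_check(void)
{
    for (int i = 0; i < TIER_COUNT; i++)
        printf("%-10s quoted %10.0f   derived %10.0f\n",
               tiers[i].name, tiers[i].quoted_factor, derived_factor(i));
}
```

With these midpoints the derived factors for SRAM, DRAM and 3D-XPoint stay close to the quoted 1, 10 and 100, while NAND and HDD come out at 32,000 and 2,000,000, below the quoted 100,000 and 10,000,000.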

Latency translated into Distance

[Figure: the latency of each memory tier drawn as a distance. Labels: 1 cm, 10 cm, 100 cm, 1000 cm, 1 km, 100 km]
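The distance labels are consistent with a scale of 1 cm per unit of latency factor: a factor of 100,000 gives 100,000 cm = 1 km, and 10,000,000 gives 100 km. That scale is inferred from the labels, not stated on the slide, so the following is only a sketch under that assumption; the function names are hypothetical.

```c
#include <assert.h>

/* Assumed scale: 1 cm of distance per unit of latency factor. */
#define CM_PER_FACTOR 1.0
#define CM_PER_KM     100000.0

/* Distance in centimetres for a given latency factor. */
double factor_to_cm(double latency_factor)
{
    assert(latency_factor > 0.0);
    return latency_factor * CM_PER_FACTOR;
}

/* Same distance expressed in kilometres. */
double factor_to_km(double latency_factor)
{
    return factor_to_cm(latency_factor) / CM_PER_KM;
}
```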

What does this mean?

NVMe Block Interface NVM-Libraries & Drivers

3D-XPoint DIMM Software Architecture

[Figure: two access paths to the 3D-XPoint DIMMs: I/O with OS buffer cache, and I/O with CPU Lx cache]

The Data Path

[Figure: four cores, each with caches labelled L1, L2, L1, sharing one L3 cache; two memory controllers, each connected to two NV-DIMM / PMEM modules. Instruction shown on the path: MOV]

The Data Path

[Figure: same core, cache and memory-controller layout as on the previous slide. Instructions shown on the path: MOV, CLFLUSH, CLFLUSHOPT, CLWB, PCOMMIT]

The Data Path

[Figure: same core, cache and memory-controller layout as on the previous slides. Instructions shown on the path: MOV, CLFLUSH, CLFLUSHOPT, CLWB]

ADR = Flush the WPQ automatically on power-fail or shutdown

Example Code

    MOV X1, 10
    MOV X2, 20          ; X1 and X2 are in PMEM
    ...
    MOV R1, X1          ; stores to X1 and X2 are globally visible, but may not be persistent
    ...
    CLFLUSHOPT X1
    CLFLUSHOPT X2       ; X1 and X2 moved from caches to memory
    SFENCE
    PCOMMIT             ; ensures PCOMMIT has completed (slide label: ADR)
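The store, flush, fence sequence above can be approximated from C with compiler intrinsics. The sketch below is an assumption-laden illustration, not the method used in the talk: it uses `_mm_clflush` (CLFLUSH) in place of CLFLUSHOPT because CLFLUSH is available on every x86-64 build without extra compiler flags, it leaves out PCOMMIT, and it relies on the platform providing ADR for the final step from the memory controller to the DIMM. `persist_range` is a hypothetical helper name, and on non-x86 builds the flush is a no-op so that the code still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>                 /* _mm_clflush, _mm_sfence */
#define FLUSH_LINE(p)  _mm_clflush(p)
#define STORE_FENCE()  _mm_sfence()
#else
/* Fallback for non-x86 builds: compiles, but does NOT flush anything. */
#define FLUSH_LINE(p)  ((void)(p))
#define STORE_FENCE()  __sync_synchronize()
#endif

#define CACHE_LINE 64u   /* 64 B granularity, as listed in the table */

/* Flush every cache line that overlaps [addr, addr + len), then fence.
 * Returns the number of cache lines flushed. */
size_t persist_range(const void *addr, size_t len)
{
    assert(addr != NULL);
    if (len == 0)
        return 0;

    uintptr_t mask  = ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t first = (uintptr_t)addr & mask;
    uintptr_t last  = ((uintptr_t)addr + len - 1) & mask;
    size_t lines = 0;

    for (uintptr_t p = first; ; p += CACHE_LINE) {
        FLUSH_LINE((const void *)p);
        lines++;
        if (p == last)
            break;
    }
    STORE_FENCE();   /* order the flushes before any later stores */
    return lines;
}
```

A caller would write its data with ordinary stores and then call `persist_range(&x, sizeof x)` before treating the update as durable.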

What does this mean?

NVM Libraries (optional)

original libart tree init routine

    int art_tree_init(art_tree *t) {
        t->root = NULL;
        t->size = 0;
        return 0;
    }

libart tree init routine … ported to PMEM

    int art_tree_init(PMEMobjpool *pop, int *newpool)
    {
        int errors = 0;
        TOID(struct art_tree_root) root;

        if (pop == NULL) { errors++; }

        if (!errors) {
            TX_BEGIN(pop) {
                /* the tree root lives in the pool's root object */
                root = POBJ_ROOT(pop, struct art_tree_root);
                if (*newpool) {
                    /* add the root to the transaction before modifying it */
                    TX_ADD(root);
                    D_RW(root)->root.oid = OID_NULL;
                    D_RW(root)->size = 0;
                    *newpool = 0;
                }
            } TX_END
        }
        return(errors);
    }

original libart art_insert routine

    void*
    art_insert(art_tree *t, const unsigned char *key, int key_len, void *value)
    {
        int old_val = 0;
        void *old = recursive_insert(t->root, &t->root, key, key_len, value, 0, &old_val);
        if (!old_val) t->size++;
        return old;
    }

libart art_insert routine … ported to PMEM

    TOID(var_string)
    art_insert(PMEMobjpool *pop, const unsigned char *key, int key_len, void *value, int val_len)
    {
        int old_val = 0;
        TOID(var_string) old;
        TOID(struct art_tree_root) root;

        TX_BEGIN(pop) {
            root = POBJ_ROOT(pop, struct art_tree_root);
            /* add the root to the transaction before modifying it */
            TX_ADD(root);
            old = recursive_insert(pop, D_RO(root)->root, &(D_RW(root)->root),
                                   (const unsigned char *)key, key_len,
                                   value, val_len, 0, &old_val);
            if (!old_val) D_RW(root)->size++;
        } TX_ONABORT {
            /* a failed transaction terminates the program */
            abort();
        } TX_END

        return old;
    }
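Both ported routines call TX_ADD(root) before they modify the root object. In libpmemobj this adds the object to the transaction's undo log, so that an aborted or interrupted transaction can be rolled back to the old contents. The toy model below illustrates that idea only: it is plain volatile C, holds a single snapshot of at most 64 bytes, and is not how libpmemobj is implemented. All names (`undo_log`, `tx_snapshot`, `tx_commit`, `tx_abort`, `tree_root`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-entry undo log, kept in ordinary volatile memory. */
struct undo_log {
    void         *target;     /* start of the snapshotted range */
    size_t        len;        /* length of the snapshotted range */
    unsigned char saved[64];  /* copy of the old contents */
    int           active;     /* non-zero while a snapshot is pending */
};

/* Conceptual counterpart of TX_ADD: save the old bytes before modifying. */
void tx_snapshot(struct undo_log *log, void *target, size_t len)
{
    assert(log != NULL && target != NULL);
    assert(len <= sizeof log->saved);
    memcpy(log->saved, target, len);
    log->target = target;
    log->len    = len;
    log->active = 1;
}

/* Commit: the new contents stay, the snapshot is dropped. */
void tx_commit(struct undo_log *log)
{
    log->active = 0;
}

/* Abort: restore the snapshotted bytes if a snapshot is still pending. */
void tx_abort(struct undo_log *log)
{
    if (log->active) {
        memcpy(log->target, log->saved, log->len);
        log->active = 0;
    }
}

/* Stand-in for the persistent tree root used in the slides. */
struct tree_root {
    void         *root;
    unsigned long size;
};
```

In this model an insert would call `tx_snapshot(&log, &r, sizeof r)`, update `r.size`, and finish with either `tx_commit(&log)` or `tx_abort(&log)`; after an abort, `r.size` holds its previous value again.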

My own experience - Summary

To get the maximum value out of it, explicit changes in the application are necessary, but they pay back (factor ~1000)

Debug support is missing

Architecture check before adapting an app:

Does it have the right structure?

Where are adaptations necessary?

Need optimized platform interconnects to create HA storage

Still room for optimization in the end-to-end software stack

Next Steps

Switch from SEP to a real prototype platform with Skylake-SP (RX2540-M4) and AEP in early 2017

Use Intel Parallel Studio to analyze and debug PMEM

(1) App-Direct

Optimize NVM-Libs: propose measures to reduce the overhead in the transactional logic and achieve optimized software storage access methods.

(2) Memory

Use Memkind-Lib: look for algorithms for a new HMM (Hierarchical Memory Management)

(3) Remote access to PMEM in App-Direct & Memory mode

BS2000 – a possible future outlook

Please contact us via the email address on the first page if you are interested in more details

ESA/390

BS2000 functional layer /390

[Figure: layered diagram with the layers TU, TPR, SIH and HW. Component labels, top to bottom:]

User Applications
Commands; Job control; SYSFILE Mgmt; SPOOL & RSO; Data transmission access (TIAM, DCAM, UTM)
Programs and Macros; Catalog Mgmt; Device Mgmt; Media Mgmt; Access Methods; Dynamic & Static loader
Accounting; Test helps; Sub-System Mgmt; Logging; Address Mgmt; Memory Mgmt; Paging
Task control; Process Mgmt; Reconfiguration; Transport services for remote data transmission (BCAM)
Physical I/O; Start I/O Interrupt handler; Task initiator; Assign CPU to task
CPU Interrupts Function, Control Register

Legend: SVC = SystemCalls; Sub-Systems

BS2000 functional layer x86

[Figure: layered diagram with the layers TU, TPR, SIH and HW, plus a block labelled "/390 CPU Emulation & JIT". Component labels, top to bottom:]

User Applications
Commands; Job control; SYSFILE Mgmt; SPOOL & RSO; Data transmission access (TIAM, DCAM, UTM)
Programs and Macros; Catalog Mgmt; Device Mgmt; Media Mgmt; Access Methods; Dynamic & Static loader
Accounting; Test helps; Sub-System Mgmt; Logging; Address Mgmt; Memory Mgmt; Paging
Task control; Process Mgmt; Reconfiguration; Transport services for remote data transmission (BCAM)
Physical I/O; Start I/O Interrupt handler; Task initiator; Assign CPU to task
CPU Interrupts Function, Control Register
/390 CPU Emulation & JIT

Legend: SVC = SystemCalls; Sub-Systems

BS2000 and VME kickin’ & alive OS development in Europe

Use of latest HW and SW technologies

Fascinating tasks and exciting missions

Support young researchers with traineeships and master / PhD thesis
