Post on 06-Aug-2020

HPC-SIG Ecosystem Validation
Jan. 14 2019

Baptiste Gerondeau, Renato Golin

For more info visit linaro.org/hpc

HPC-SIG Lab and Validation Matrix

Aggregate machines in the same infrastructure, and validate their performance using a Validation Matrix

● Validation Matrix must be applicable to every machine
● Validation Matrix dimensions are software configurations

To generate as few tests as possible, we need to simplify the matrix without losing information
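As a minimal sketch of what simplifying the matrix can mean in practice, the Python below builds the full Cartesian product of a few dimensions and then prunes combinations that cannot be deployed. The dimension names come from this deck; the values (`os-a`, `os-b`, and so on) and the `is_valid` rule are illustrative assumptions, not the lab's actual configuration.

```python
from itertools import product

# Dimension names follow the slides; the values are illustrative placeholders.
DIMENSIONS = {
    "provisioning": ["warewulf-stateless", "warewulf-stateful", "ansible"],
    "kernel": ["os-upstream", "erp"],
    "os": ["os-a", "os-b", "debian"],
    "hpc_stack": ["openhpc"],
}


def is_valid(config):
    """Hypothetical constraint: the deck notes OpenHPC has no Debian support."""
    return not (config["hpc_stack"] == "openhpc" and config["os"] == "debian")


def build_matrix(dimensions, constraint):
    """Return (every combination, only the combinations that pass the constraint)."""
    names = list(dimensions)
    full = [dict(zip(names, values)) for values in product(*dimensions.values())]
    return full, [config for config in full if constraint(config)]


full, pruned = build_matrix(DIMENSIONS, is_valid)
print(len(full), len(pruned))  # 18 combinations in total, 12 after pruning
```

Pruning by constraint removes tests that could never run, so no information is lost; any further reduction (for example, sampling combinations) would need its own justification.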

HPC-SIG Lab’s Infrastructure

The infrastructure needs to:
● Dispatch jobs (tests, provisioning, benchmarks)
● Provide DHCP/TFTP services
● Provide Package Cache services
● Provide a secure file/results storage service
● Be Low Maintenance
● Be able to be replicated anywhere else

Simplifying Infrastructure: Identifying the different dimensions

A Vertical Slice of the Stack

Principal dimensions:
➔ Application
➔ HPC environment stack
➔ Machine provisioning

● HPC Stack : OpenHPC

● Validation Application : OpenHPC’s testsuite

Simplifying Infrastructure: Identifying the different dimensions

The Stack from the Lab’s point of view

Machine provisioning :

➔ Network configuration
➔ Kernel
➔ OS
➔ HPC Stack

● Multiple ways to do the provisioning

Simplifying Infrastructure: Identifying the different dimensions

Provisioning Method Variations

Multiple ways to provision :

➔ Warewulf Stateless (VNFS)
➔ Warewulf Stateful (OS image)
➔ Ansible

Simplifying Infrastructure: Identifying the different dimensions

Different Network Layouts

● Flat : Machines reachable from anywhere

● Tree: Machines reachable from cluster head node only

● Root : Master with DHCP/TFTP server

Simplifying Infrastructure: Identifying the different dimensions

Different Kernels

● Upstream from OS

● ERP : Enterprise Reference Platform

● Contains support for platforms in the process of being upstreamed

Simplifying Infrastructure: Identifying the different dimensions

Different Operating Systems

● 3 OSes available to the user

● No Debian support in OpenHPC

Simplifying Infrastructure: Abstractions, and the user’s environment

Abstracting Network Variations

● Invisible to the user
● Handled by the lab installer
● Dependent on hardware

Simplifying Infrastructure: Abstractions, and the user’s environment

Abstracting Provisioning Variations

● Multi-staged provisioning
● Coexistence
● Dependent on hardware

Simplifying Infrastructure: Abstractions, and the user’s environment

Abstracting Environment Variations

● Control over HPC Stack
● Common OS configuration
● Idempotency
● Package Caches
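Idempotency here means that re-running a configuration step leaves an already configured machine unchanged. The Python below is a small sketch of that property only; the `ensure_line` helper, the file name, and the `cache_url` setting are hypothetical and are not taken from the lab's Ansible recipes.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Append `line` to `path` only if it is missing; return True if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already configured, so a second run is a no-op
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "example.conf"  # hypothetical configuration file
    first = ensure_line(cfg, "cache_url=http://cache.example/")
    second = ensure_line(cfg, "cache_url=http://cache.example/")
    content = cfg.read_text()

print(first, second)  # True False
```

Reporting whether anything changed is the same idea configuration tools use to distinguish "changed" from "ok" on repeated runs.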

Simplifying Infrastructure: Abstractions, and the user’s environment

Accounting for extra HPC services

● Infiniband Support
● Lustre server support
● Future additional features

(additional hardware)

Simplifying Infrastructure: What the User sees, configures

The Lab’s Interface

➔ Choose Application

❖ Lab picks default configuration
❖ User fine tunes configuration
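One way to picture this interface is a lookup of per-application defaults followed by a merge of user overrides. The sketch below assumes a hypothetical `DEFAULTS` table and `resolve_config` function; the keys mirror the dimensions named in this deck, while the values are placeholders rather than the lab's real settings.

```python
# Hypothetical per-application defaults; keys mirror dimensions named in the slides.
DEFAULTS = {
    "openhpc-testsuite": {
        "provisioning": "warewulf-stateless",
        "kernel": "os-upstream",
        "os": "os-a",
        "hpc_stack": "openhpc",
    },
}


def resolve_config(application, overrides=None):
    """Start from the lab's defaults, then apply the user's fine-tuning."""
    if application not in DEFAULTS:
        raise KeyError(f"no default configuration for {application!r}")
    config = dict(DEFAULTS[application])  # copy so the defaults stay untouched
    overrides = overrides or {}
    unknown = set(overrides) - set(config)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")
    config.update(overrides)
    return config


default_cfg = resolve_config("openhpc-testsuite")
tuned_cfg = resolve_config("openhpc-testsuite", {"kernel": "erp"})
print(tuned_cfg["kernel"])  # erp
```

Rejecting unknown keys keeps the user's choices inside the dimensions the lab actually exposes.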

Validation matrix: Cluster Deployment

Validation matrix: Distributed Applications Enablement

Validation matrix: Toolchain Benchmarking

Validation matrix: Library Enablement and Enhancement

Future

● Vendors to rely on Linaro for base OSS validation
○ We have multiple vendors available
○ On a standardised infrastructure

● Share our work
○ OpenHPC Ansible recipes (with the OpenHPC community)
○ SDI (MrP, Jenkins, Ansible) helping members to replicate our work
○ Community CI (OpenHPC test-suite, MPI MTT, OpenMP tests, OpenBLAS CI)

● Allow our engineers to develop the ecosystem
○ Internal tests and benchmarks (via Jenkins, no infrastructure knowledge needed)
○ Testing new packages, libraries, compilers (comparison jobs, CI results, statistical analysis)

HPC Lab Setup : https://github.com/Linaro/hpc_lab_setup

Ansible OpenHPC installation recipe : https://github.com/Linaro/ansible-playbook-for-ohpc

Thanks!