© 2011 ANSYS, Inc. 8/29/11

ANSYS HPC High Performance Computing Leadership

Barbara Hutchings [email protected]

Why HPC for ANSYS? Insight you can’t get any other way

HPC enables high fidelity
• Solve the un-solvable
• Be sure your design is “right”
• Innovate with confidence

HPC delivers throughput
• Consider multiple design ideas
• Optimize your design
• Ensure performance across a range of conditions


HPC has become a software issue.

• Clock Speed – Leveling off

• Core Counts – Growing

• Exploding (GPUs)

• Future performance depends on highly scalable parallel software

• ANSYS goal: lead the way into this future.

Source: http://www.lanl.gov/news/index.php/fuseaction/1663.article/d/20085/id/13277


HPC Deployment - Trends / Challenges

Local computing infrastructure for simulation is being replaced by centralized HPC resources, shared by a globally distributed workforce.

• Remote access, scheduling, and visualization tools

• Data sharing, IP protection, central data archiving

“Mega” Simulations for high-fidelity understanding

Design Exploration for improved product integrity

• (Intermittent) need for 1000’s of processors, terabytes of RAM

• Storage / data management

“Private Cloud” strategies for infrastructure

• Remote or on-site, third-party (3P) managed
• Elastic capacity, rapid deployment, service level agreements, CAPEX vs. OPEX

Agenda

Performance / Software Milestones
• Scalability, GPUs

Current HPC Practice
• Customer case studies

Deployment Trends
• ANSYS vision

ANSYS FLUENT Scaling Achievement

[Figure: rating vs. number of cores, two panels: "2008 Hardware (Intel Harpertown, DDR IB)" and "2010 Hardware (Intel Westmere, QDR IB)", the latter with core-count ticks running to 1536]

Systems keep improving: faster processors, more cores
• Ideal rating (speed) doubled in two years!

Memory bandwidth per core and network latency/bandwidth stress scalability
• 2008 release (12.0) re-architected MPI – huge scaling improvement, for a while…
• 2010 release (13.0) introduces hybrid parallelism – and scaling continues!
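The slides report scaling as a "rating" without defining it. The sketch below assumes the usual Fluent benchmark convention, where rating is the number of benchmark runs that fit in 24 hours (86,400 s divided by the wall-clock seconds per run), and derives a parallel efficiency from two ratings. The function names and the timings are hypothetical and are not read off the plots.

```python
SECONDS_PER_DAY = 86_400


def rating(wall_seconds: float) -> float:
    """Benchmark runs per day, assuming one run takes wall_seconds."""
    return SECONDS_PER_DAY / wall_seconds


def parallel_efficiency(rating_base: float, cores_base: int,
                        rating_n: float, cores_n: int) -> float:
    """Measured speedup divided by the ideal (linear) speedup."""
    speedup = rating_n / rating_base
    ideal = cores_n / cores_base
    return speedup / ideal


# Hypothetical wall-clock times for the same case on 64 and 1024 cores.
r_64 = rating(1350.0)
r_1024 = rating(108.0)
efficiency = parallel_efficiency(r_64, 64, r_1024, 1024)

print(f"rating @ 64 cores:   {r_64:.1f}")
print(f"rating @ 1024 cores: {r_1024:.1f}")
print(f"parallel efficiency: {efficiency:.1%}")
```

An efficiency near 100% means the rating curve tracks the ideal line; the gap between the two is what memory bandwidth and network latency cost.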

Extreme CFD Scaling - 1000’s of cores

Enabled by ongoing software innovation

Hybrid parallel: fast shared-memory communication (OpenMP) within a machine to speed up overall solver performance; distributed memory (MPI) between machines

Parallel Scaling ANSYS Mechanical

[Figure: solution rating vs. number of cores, two panels: "Sparse Solver (Parallel Re-Ordering)", 0 to 256 cores, rating axis 0 to 300; "PCG Solver (Pre-Conditioner Scaling)", 0 to 64 cores, rating axis 0 to 4000]

Focus on bottlenecks in the distributed memory solvers (DANSYS)
– Sparse Solver
  • Parallelized equation ordering
  • 40% faster w/ updated Intel MKL
– Preconditioned Conjugate Gradient (PCG) Solver
  • Parallelized preconditioning step

Architecture-Aware Partitioning

Original partitions are remapped to the cluster considering the network topology and latencies

Minimizes inter-machine traffic, reducing load on network switches

Improves performance, particularly on slow interconnects and/or large clusters

[Figure: partition graph for 3 machines with 8 cores each, colors indicating machines; panels "Original mapping" and "New mapping"]
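The idea of remapping partitions to cut inter-machine traffic can be illustrated with a toy model. This is only a sketch: the ring-shaped partition graph, the unit edge weights, and the breadth-first fill heuristic are all assumptions chosen for illustration, and nothing here is the algorithm ANSYS uses. The machine layout (3 machines, 8 cores each) follows the figure.

```python
from collections import deque


def inter_machine_traffic(edges, machine_of):
    """Sum the weights of edges whose endpoints sit on different machines."""
    return sum(w for a, b, w in edges if machine_of[a] != machine_of[b])


def remap_by_bfs(num_parts, edges, cores_per_machine):
    """Fill machines in breadth-first order so neighbours tend to share a machine."""
    adj = {p: [] for p in range(num_parts)}
    for a, b, _ in edges:
        adj[a].append(b)
        adj[b].append(a)
    order, seen = [], set()
    for start in range(num_parts):  # outer loop covers disconnected graphs
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            p = queue.popleft()
            order.append(p)
            for q in adj[p]:
                if q not in seen:
                    seen.add(q)
                    queue.append(q)
    return {p: i // cores_per_machine for i, p in enumerate(order)}


# Hypothetical partition graph: 24 partitions in a ring, unit traffic per edge.
NUM_PARTS, MACHINES, CORES = 24, 3, 8
edges = [(i, (i + 1) % NUM_PARTS, 1) for i in range(NUM_PARTS)]

# "Original" mapping: partitions dealt out round-robin across machines.
original = {p: p % MACHINES for p in range(NUM_PARTS)}
remapped = remap_by_bfs(NUM_PARTS, edges, CORES)

traffic_original = inter_machine_traffic(edges, original)
traffic_remapped = inter_machine_traffic(edges, remapped)
print("inter-machine traffic, original:", traffic_original)
print("inter-machine traffic, remapped:", traffic_remapped)
```

A production partitioner would also weight each crossing by the latency of the link it uses, which is where the network topology enters.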

File I/O Performance

Case file IO
• Both read and write significantly faster in R13
• A combination of serial-IO optimizations as well as parallel-IO techniques, where available

Parallel-IO (.pdat)
• Significant speedup of parallel IO, particularly for cases with large number of zones
• Support for Lustre, EMC/MPFS, AIX/GPFS file systems added

Data file IO (.dat)
• Performance in R12 was highly optimized. Further incremental improvements done in R13

Parallel Data write R12 vs. R13

Case         Change
BMW          -68%
FL5L2 4M     -63%
Circuit      -97%
Truck 14M    -64%

[Figure: truck_14m, case read]
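The R12 vs. R13 figures are given as percentage changes. Assuming they denote the reduction in parallel data write time (the slide does not say so explicitly), a reduction of r percent corresponds to a speedup factor of 100 / (100 - r). A short sketch of that conversion, with hypothetical names:

```python
# Percentage reductions as listed on the slide (assumed to be write-time reductions).
reduction_pct = {
    "BMW": 68,
    "FL5L2 4M": 63,
    "Circuit": 97,
    "Truck 14M": 64,
}


def speedup_from_reduction(pct: float) -> float:
    """Convert a time reduction in percent into a speedup factor."""
    return 100 / (100 - pct)


speedups = {case: speedup_from_reduction(pct) for case, pct in reduction_pct.items()}

for case, factor in speedups.items():
    print(f"{case:<10} -{reduction_pct[case]}%  ->  {factor:.1f}x faster")
```

The conversion is nonlinear: under this reading, the 97% reduction for the Circuit case is roughly a 33x speedup, while the other three cases fall near 3x.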

GPU Computing!

CPUs and GPUs work in a collaborative fashion

CPU: multi-core processors
• Typically 4-6 cores
• Powerful, general purpose

GPU: many-core processors
• Typically hundreds of cores
• Great for highly parallel code, within memory constraints

[Diagram: CPU and GPU connected by a PCI Express channel]

ANSYS Mechanical SMP – GPU Speedup

Tesla C2050 and Intel Xeon 5560

[Figure: solver kernel speedups and overall speedups]

From NAFEMS World Congress, May 2011, Boston, MA, USA: “Accelerate FEA Simulations with a GPU” by Jeff Beisheim, ANSYS

New: GPU Acceleration for DANSYS

[Figure: R14 Distributed ANSYS total simulation speedups for the R13 benchmark set; speedup axis 0 to 3]

• Windows workstation: two Intel Xeon 5560 processors (2.8 GHz, 8 cores total), 32 GB RAM, NVIDIA Tesla C2070, Windows 7, TCC driver mode

ANSYS Mechanical – Multi-Node GPU

• Solder Joint Benchmark (4 MDOF, Creep Strain Analysis)

[Figure: R14 Distributed ANSYS with and without GPU, total speedup (axis 0 to 10); model labels: mold, PCB, solder balls]

Linux cluster: each node contains 12 Intel Xeon 5600-series cores, 96 GB RAM, NVIDIA Tesla M2070, InfiniBand

Results courtesy of MicroConsult Engineering, GmbH

GPU Acceleration for CFD

First capability for physics outside core linear solver, with modest memory requirement: view factors, ray tracing, reaction rates, etc.

R&D focus on linear solvers, smoothers – but potential limited by Amdahl’s Law

[Figure: radiation viewfactor calculation (ANSYS FLUENT 14)]
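Amdahl's Law explains why accelerating only the linear solver caps the overall gain: if a fraction p of the run time is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s), which can never exceed 1 / (1 - p). A minimal sketch follows; the 80% solver fraction and the 10x kernel speedup are hypothetical values, not figures from the slides.

```python
def amdahl_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when only part of the run time is accelerated."""
    serial = 1.0 - accelerated_fraction
    return 1.0 / (serial + accelerated_fraction / kernel_speedup)


def amdahl_limit(accelerated_fraction: float) -> float:
    """Upper bound on overall speedup as the kernel speedup grows without bound."""
    return 1.0 / (1.0 - accelerated_fraction)


# Hypothetical: the linear solver is 80% of run time and runs 10x faster on a GPU.
overall = amdahl_speedup(0.8, 10.0)
ceiling = amdahl_limit(0.8)

print(f"overall speedup: {overall:.2f}x")
print(f"best possible:   {ceiling:.2f}x")
```

With these numbers a 10x faster solver yields about 3.6x overall, and even an infinitely fast solver would stop at 5x, which is why moving further physics to the GPU matters.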

Agenda

Performance / Software Milestones
• Scalability, GPUs

Current HPC Practice
• Customer case studies

Deployment Trends
• ANSYS vision

HPC for Turbocharger Design

• 8M to 12M element models (ANSYS CFX)
• Previous practice (8 nodes HPC)
  – Full stage compressor runs 36-48 hours
  – Turbine simulations up to 72 hours
• Current practice (160 nodes)
  – 32 nodes per simulation
  – Full stage compressor 4 hours
  – Turbine simulations 5-6 hours
  – Simultaneous consideration of 5 ideas
  – Ability to address design uncertainty – clearance tolerance

“ANSYS HPC technology is enabling Cummins to use larger models with greater geometric details and more-realistic treatment of physical phenomena.”

http://www.ansys.com/About+ANSYS/ANSYS+Advantage+Magazine/Current+Issue
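The turbocharger figures can be cross-checked with simple arithmetic: 160 nodes at 32 nodes per simulation gives the 5 simultaneous design ideas quoted on the slide, and the run times imply the turnaround speedups computed below. The variable and function names are illustrative, and the turbine result is an upper range because the earlier run time is only given as "up to 72 hours".

```python
def turnaround_speedup(old_hours: float, new_hours: float) -> float:
    """Ratio of the previous run time to the current run time."""
    return old_hours / new_hours


# Cluster capacity, from the slide.
total_nodes = 160
nodes_per_simulation = 32
concurrent_simulations = total_nodes // nodes_per_simulation

# Full stage compressor: 36-48 hours before, 4 hours now.
compressor = (turnaround_speedup(36, 4), turnaround_speedup(48, 4))

# Turbine: up to 72 hours before, 5-6 hours now.
turbine = (turnaround_speedup(72, 6), turnaround_speedup(72, 5))

print("concurrent simulations:", concurrent_simulations)
print(f"compressor speedup: {compressor[0]:.0f}x to {compressor[1]:.0f}x")
print(f"turbine speedup (upper range): {turbine[0]:.1f}x to {turbine[1]:.1f}x")
```

Each simulation uses 4 times the nodes of the whole earlier cluster (32 vs. 8) yet finishes 9 to 12 times sooner in the compressor case, so the gain is not explained by node count alone.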

HPC for High Fidelity at EuroCFD

• Model sizes up to 100M cells (ANSYS FLUENT)
• 2011 cluster of 700 cores (expansions pending)
  – 64-256 cores per simulation

[Figure: timeline with year labels 2005, 2007, 2009, 2011 and model-size labels 3 million cells (6 days), 10 million (5 days), 25 million (4 days), 50 million (2 days)]

Increase of:
• Spatial-temporal accuracy
• Complexity of physical phenomena

Supersonic, Multiphase, Radiation, Compressibility, Conduction/Convection, Transient, Optimisation / DOE, Dynamic Mesh, LES Combustion, Aeroacoustic, Fluid Structure Interaction

HPC for High Fidelity at Microconsult GmbH

Solder joint failure analysis
• Thermal stress 7.8 MDOF
• Creep strain 5.5 MDOF

Simulation time reduced from 2 weeks to 1 day
• From 8 – 26 cores (past) to 128 cores (present)

“HPC is an important competitive advantage for companies looking to optimize the performance of their products and reduce time to market.”


“HPC” on the Desktop

•Cognity Limited – steerable conductors for oil recovery

•ANSYS Mechanical simulations to determine load carrying capacity

•750K elements, many contacts

•12 core workstations / 24 GB RAM

•6X speedup / results in 1 hour or less

•5-10 design iterations per day

“Parallel processing makes it possible to evaluate five to 10 design iterations per day, enabling Cognity to rapidly improve their design.”

http://www.ansys.com/About+ANSYS/ANSYS+Advantage+Magazine/Current+Issue

Desktop HPC at NVIDIA

Case study on the value of HW refresh and software best-practice

Deflection and bending of 3D glasses
• ANSYS Mechanical – 1M DOF models

Optimization of:
• Solver selection (direct vs iterative)
• Machine memory (in core execution)
• Multicore (8-way) parallel with GPU acceleration

Before/After: 77x speedup – from 60 hours per simulation to 47 minutes.

Most importantly: HPC tuning added scope for design exploration and optimization.

Agenda

Performance / Software Milestones
• Scalability, GPUs

Current HPC Practice
• Customer case studies

Deployment Trends
• ANSYS vision

Remote Access to Simulation / Cloud

Cloud computing puts renewed focus on an ongoing challenge:

How can we optimize remote access to HPC for ANSYS users?

Critical emerging issue for
• Public cloud (outsourcing of HPC)
• Private cloud (internal remote access / elastic provisioning)

Levels of Remote Access

Level 0: Local Computing
• Pre / Solve / Post on local desktop system
• Files stored locally under individual control
• Inherent capacity limitations; also limits collaboration and data management

Level 1: Remote Batch
• Pre / Post on local desktop system
• Batch solve conducted on ‘remote’ HPC resource
• Bottlenecks related to file transfer and limitations of local hardware (e.g., can’t postprocess large files)
• Remote access and job management also challenging.

Level 2: Remote Interactive Workflow

[Diagram: a mobile user doing “Full Remote Simulation” reaches a remote HPC resource (Pre / Solve / Post / EKM, data storage, remote 3D visualization server) through web browser access and remote 3D visualization]

• Schedule / provision HW resources
• Launch application
• Launch remote visualization tool for interactive use
• Access data (EKM)
• Monitor job progress

• Full simulation process conducted via remote access (Pre/Solve/Post)
• Files reside at the HPC resource – for efficiency, enhanced data management, collaboration
• Feasible and implemented today by best-in-class

Simulation Data Management

[Diagram: ANSYS EKM architecture. Components: web browser (IE, Firefox, etc.); desktop application (ANSYS Workbench, VB, etc.); firewall; file server (http, ftp); relational database (MySQL, Oracle, DB2, etc.); compute cluster; application server (JBoss, etc.); content management repository (Jackrabbit, etc.). Annotations: stores metadata; executes simulations and extracts data using a batch system such as RSM, LSF, SGE; repository of all archived files and applications]

Access results over the full product lifecycle. Re-use of previous simulations. Long-term archival storage.

Interface to PLM / PDM

Pull functional requirements or geometry from PLM to ANSYS EKM
• Check update status from EKM

Return simulation results to PLM
• Associated to the master model

[Diagram: ANSYS EKM linked through EKM Datalink to PLM systems (Windchill, Teamcenter, MatrixOne, SmarTeam); labels: functional spec, geometry, attributes; simulation report]

Public Cloud

Cloud computing could provide cost-effective access to scaled-out elastic infrastructure
• Scale up - extreme problem size, data storage and backup
• Surge capacity, intermittent workloads or users
• Cost effective balance of CAPEX vs. OPEX

But challenges remain for
• IP protection, export compliance and data sharing
• Optimized use of on-premise vs. “remote” systems
• Remote workflow (visualization, file transfer)

ANSYS focus:
• Improved remote simulation workflow (private cloud)
• Enable seamless use of 3P hosted cloud
  – “Bring Your Own License” model

Agenda

Performance / Software Milestones
• Scalability, GPUs

Current HPC Practice
• Customer case studies

Deployment Trends
• ANSYS vision

Summary / Next steps


Summary / “Take Home” Points

High-Performance Computing can add great value to your use of ANSYS

• What could you learn from a 10M (100M) cell / DOF model?

ANSYS continues to focus on software development for HPC

• Mission Critical - in order to maintain scaling as the hardware ‘revolution’ continues

We believe that this focus will enable ANSYS to maximize your overall return on investment.

Optimizing IT deployment is a critical challenge
• Getting started or scaling out
  – Desktop / remote access, data management and archiving, job submission and resource management, etc.


Next Steps

ANSYS is committed to understanding your IT environment and deployment challenges

ANSYS (and our partners) can provide solutions today – and you can help drive our product strategy.

Questions/Comments: [email protected]

Thank You
