RSSI 2007 FPGA Acceleration For Production Use

Page 1: RSSI 2007 FPGA Acceleration For Production Use

RSSI 2007

FPGA Acceleration For

Production Use

Matthias Fouquet-Lapar

Principal Engineer

Multi-Paradigm Computing

[email protected]

Slide 2

Challenges for FPGAs in Production Use

• FPGAs remain (b)leading edge technology to the majority of customers

• High Level Language Tools make it easier to program FPGAs, but it remains a major effort :

– Flat Application profiles in many areas

– Majority of legacy applications written in Fortran

– Significant investment for a customer with uncertain result

• System Integration (remember we talk HPC, not PC !)

– Job Scheduling / Resource Allocation

– Accounting

– Scalability

– RAS

Slide 3

And HPC typically looks like this :

Slide 4

Challenges for FPGAs in Production Use

• Lots of different options :

– Continuing fast evolution of top-end micro-processors

– Multi-core & many-core micro-processors

– GPGPUs

– Cell

– Dedicated FP accelerators

• Many past performance claims of “orders of magnitude” speedups really talked about one specific accelerated loop – the overall benefit to the customer was probably < 10X
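
The gap between a per-loop speedup and the overall benefit follows from Amdahl's law. A minimal sketch, assuming a hypothetical application in which the accelerated loop accounts for a given fraction of the run time; the fractions and the 100X loop speedup are illustrative values, not figures from this talk:

```python
def overall_speedup(accelerated_fraction: float, loop_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only part of the run time is accelerated."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if loop_speedup <= 0.0:
        raise ValueError("loop_speedup must be positive")
    # Time that is not accelerated stays the same; the rest shrinks by loop_speedup.
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / loop_speedup)


if __name__ == "__main__":
    # Illustrative profile fractions (assumptions, not measurements).
    for fraction in (0.5, 0.9, 0.99):
        speedup = overall_speedup(fraction, 100.0)
        print(f"loop is {fraction:.0%} of run time -> overall speedup {speedup:.2f}X")
```

Even with a 100X loop speedup, the application as a whole stays below 10X unless the loop covers more than about 90% of the original run time, which is why flat application profiles are hard to accelerate.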

Slide 5

Challenges for FPGAs in Production Use

• The (in)famous “X” factor

– Any implementation has to compare itself to leading edge micro-processors, not to processor generations that are years old

– This has had (and continues to have) a very negative impact on the entire FPGA community

• Lack of Standardization

– SGI recognizes the enormous value Intel’s Quick Assist technology is bringing to the table

– We are fully engaged in sharing years of experience developing accelerated solutions as well as 25+ years of experience in HPC

Slide 6

Driving scalability in HPC

• In-memory databases become more important for accelerated computing

– Largest Memory Configuration installed to date : 40 TeraBytes

• Continuing FPGA scaling

– Scaling of a 4 FPGA BLAST-N Benchmark to 8 FPGAs showed a linear speedup

– Largest FPGA Configuration tested to date included 30 FPGAs

– Next milestone : 128 FPGAs in Single System Image

Slide 7

Additional Support of X86 based ICE

• Strategic Relationship with Intel working on Quick Assist

• Implementation of an FSB solution for new Application Spaces

• Focus remains on delivering applications and solutions to customers using 25+ years of in-depth application experience in HPC

Slide 8

SGI’s Application Focus

• Government

– Classified

• Life Sciences

– Bio-Informatics

– Genomics

– Chem Informatics

• Data Management

– Encryption

– Content Analysis, Search and Filtering

– Image Processing

Slide 9

SGI and Life Sciences Application Focus

• Bio-Informatics and Genomic research have an exponential growth rate

Slide 10

SGI’s history in life science

• Over 20 years of experience providing high performance solutions for life sciences applications with a dedicated worldwide team of experts in bioinformatics and computational chemistry

• Other life-science partnerships include

– Gaussian

– Schrödinger

– SCM

– Several open source applications

Slide 11

Partnership : Creating a Life Science Appliance

• SGI and Mitrionics share a common vision : Provide turn-key solutions and appliances to facilitate easier adoption of accelerator technologies for academic and industrial partners

• Focus on application areas which have dramatically increasing computing demands and which are addressable with today’s FPGA technology

Slide 12

Our Vision

• Open Systems Scalable Infrastructure

• Open Source

• Create a Bio-Informatics Community adding to the existing application stack :

– IBLAST (Interactive BLAST)

– Smith-Waterman

– Clustal W

– Needleman-Wunsch

• Industrial-grade quality of the implementation

• Workshops to facilitate and enable development

• “Full Care Support Option” for customers who want to run out of the box

Slide 13

Creating a BLAST-N solution

• From a technical perspective these applications are a good fit for FPGAs

– Integer (actual character/bit sized operands)

– High degree of parallelism

– “Hot spots” in the application profile

– Earlier black box implementations (using ASICs or FPGAs) from companies such as Paracel or Time-Logic have shown the potential, but :

  • Limited acceptance by customers since there was no way of changing the black-box (taking the “Programmable” out of FPGAs)

  • Limited I/O Bandwidth, Limited system Integration

  • Expensive

  • Too specialized

  • Non-general purpose
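
The “bit sized operands” point can be illustrated in software. A minimal sketch, assuming the common 2-bit nucleotide encoding (A=0, C=1, G=2, T=3); the function names are hypothetical and this is not the RASC or Mitrionics implementation:

```python
# Assumed 2-bit encoding of the four nucleotides.
ENCODING = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq: str) -> int:
    """Pack a nucleotide string into one integer, 2 bits per base."""
    value = 0
    for base in seq.upper():
        value = (value << 2) | ENCODING[base]
    return value


def count_matches(a: str, b: str) -> int:
    """Count positions where two equal-length sequences agree.

    XOR of the packed words leaves 00 in every 2-bit field where the bases match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    diff = pack(a) ^ pack(b)
    matches = 0
    for _ in range(len(a)):
        if diff & 0b11 == 0:
            matches += 1
        diff >>= 2
    return matches


if __name__ == "__main__":
    print(bin(pack("ACGT")))
    print(count_matches("ACGT", "ACGA"))
```

Each base comparison needs only a 2-bit operand and the comparisons are independent of one another, which is the kind of narrow, highly parallel integer work the slide describes as a good fit for FPGA logic.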

Slide 14

Design goals for BLAST-N : Easy Integration into existing customer workflows

• NCBI BLAST is the de-facto standard for the majority of customers

• An RASC appliance should be able to plug into existing workflows

– Consistent results with NCBI BLAST

– Coherent parameter set with NCBI BLAST

– Option to run either the standard CPU version or the FPGA accelerated version

– Automatic fallback to CPU versions if all FPGAs are busy

– Work “out of the box”
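
The CPU/FPGA option and the automatic fallback listed above could be sketched as follows. This is an illustration under assumptions: `FpgaPool`, `run_blast`, `run_fpga_blast` and `run_cpu_blast` are hypothetical names, not part of NCBI BLAST or the RASC software.

```python
import threading
from typing import Callable


class FpgaPool:
    """Tracks how many FPGAs are free; acquiring never blocks."""

    def __init__(self, fpga_count: int) -> None:
        self._free = threading.Semaphore(fpga_count)

    def try_acquire(self) -> bool:
        # Returns False immediately when every FPGA is busy.
        return self._free.acquire(blocking=False)

    def release(self) -> None:
        self._free.release()


def run_blast(
    query: str,
    pool: FpgaPool,
    run_fpga_blast: Callable[[str], str],
    run_cpu_blast: Callable[[str], str],
    force_cpu: bool = False,
) -> str:
    """Use an FPGA when one is free, otherwise fall back to the CPU version."""
    if not force_cpu and pool.try_acquire():
        try:
            return run_fpga_blast(query)
        finally:
            pool.release()
    return run_cpu_blast(query)


if __name__ == "__main__":
    pool = FpgaPool(fpga_count=4)
    result = run_blast("ACGT", pool, lambda q: "fpga:" + q, lambda q: "cpu:" + q)
    print(result)
```

Because the caller passes the same query to either path, the fallback is only transparent if both versions accept a coherent parameter set and return consistent results, which is what the preceding goals ask for.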

Slide 15

… but a customer expects more than a simple replacement

• Order of magnitude of speedup compared to current implementations

• Scalable solution – the system can grow with steadily increasing demands

• Not a black-box solution :

– Open System Architecture

– Open Source Software

• In addition :

– Very significant savings in terms of infrastructure (power, cooling) : green computing

– Reduction of foot-print (machine room size)

Slide 16

So what did we achieve with BLAST 1.0 ?

released on 15-Jun-2007 on sourceforge.net

• Test case :

– 500bp query AB000401 (Mycoplasma capricolum rpmH and dnaA genes, partial cds) from the EMBL database

– Database set includes the NCBI BLAST benchmark suite’s nucleotide database (benchmark.nt), the Mouse EST database, and the Nonredundant Nucleotide (NT) database

• Results :

– Using the NCBI BLAST benchmark suite for BLASTN, a single 500 bp query ran slower than the CPU implementation

– Merging 64 x 500bp queries speedup : 10X – 28X

– Merging 256 x 500bp queries speedup : 12X – 56X
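
The results above depend on merging many short queries into one run. A minimal sketch of that batching step, assuming queries are held as (identifier, sequence) pairs and merged into a multi-record FASTA string; the helper names are hypothetical, and 64 is one of the batch sizes quoted on the slide:

```python
from typing import Iterator, List, Tuple

Query = Tuple[str, str]  # (identifier, nucleotide sequence)


def batches(queries: List[Query], batch_size: int) -> Iterator[List[Query]]:
    """Yield consecutive groups of at most batch_size queries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(queries), batch_size):
        yield queries[start:start + batch_size]


def to_fasta(batch: List[Query]) -> str:
    """Merge one batch into a single multi-record FASTA string."""
    return "".join(f">{name}\n{seq}\n" for name, seq in batch)


if __name__ == "__main__":
    # Placeholder sequences; real queries would be about 500 bp each.
    queries = [(f"query{i}", "ACGT") for i in range(130)]
    for batch in batches(queries, 64):
        merged = to_fasta(batch)
        print(len(batch), "queries merged,", len(merged), "characters")
```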

Slide 17

Benchmark Results compared to 4 core Opteron 8820 SE server

[Bar chart : Throughput (queries/min) for two test cases, with a Speedup axis from 0 to 70]

• 6 Large Queries averaging 115,000+ bp from the Drosophila Genome against the GenBank Mouse EST Database with 2.1B bp

– 4-core Opteron 8820 SE server : 0.74 queries/min

– SGI RASC Appliance for Bioinformatics with 4 FPGAs : 13.33 queries/min

• Production Run of 3,534 Short Queries (25bp) against a human genome database with 4B bp

– 4-core Opteron 8820 SE server : 25 queries/min

– SGI RASC Appliance for Bioinformatics with 4 FPGAs : 1,490 queries/min

Slide 18

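Factoring in Green Computing

[Bar chart : Results Relative to Opteron Server, Speedup axis broken between 3 and 59; absolute values shown as data labels]

• Performance (Queries/Min)

– 4-core Opteron 8820 SE server : 25

– SGI RASC Appliance for Bioinformatics with 4 FPGAs : 1,490

• Power (Watts)

– 4-core Opteron 8820 SE server : 726

– SGI RASC Appliance for Bioinformatics with 4 FPGAs : 696

• Queries/KWHr

– 4-core Opteron 8820 SE server : 2,049

– SGI RASC Appliance for Bioinformatics with 4 FPGAs : 128,501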
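
The Queries/KWHr metric can be recomputed from throughput and power. A minimal sketch, assuming the chart's data labels are read as 25 and 1,490 queries/min and 726 and 696 Watts for the Opteron server and the RASC appliance respectively; that reading of the extracted labels is an assumption, and the function name is hypothetical:

```python
def queries_per_kwh(queries_per_min: float, watts: float) -> float:
    """Queries completed per kilowatt-hour at a steady throughput and power draw."""
    if watts <= 0.0:
        raise ValueError("watts must be positive")
    queries_per_hour = queries_per_min * 60.0
    kilowatts = watts / 1000.0
    return queries_per_hour / kilowatts


if __name__ == "__main__":
    # Values as read from the chart labels (see assumption above).
    opteron = queries_per_kwh(25.0, 726.0)
    rasc = queries_per_kwh(1490.0, 696.0)
    print(f"Opteron server : {opteron:,.0f} queries/kWh")
    print(f"RASC appliance : {rasc:,.0f} queries/kWh")
    print(f"Ratio          : {rasc / opteron:.1f}X")
```

Under that reading the results land within about 1% of the 2,049 and 128,501 shown on the slide, which would be consistent with the displayed throughput figures having been rounded.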

Slide 19

Running multiple instances

• Executing multiple instances, very good scalability

Slide 20

Results

• Wall-Clock throughput improvements for real test cases between 10X – 60X compared to Opteron 2.8 GHz

• Results are consistent with NCBI’s BLAST-N implementation (< 0.3 error rate)

• Power Consumption per query 90% - 95% less than CPU implementation

• Clear Price / Performance advantage over top-end quad-core micro-processor implementations (leaving out power & cooling cost)

• Attractive complete bundle including Hardware, System Software and the BLAST-N application for less than $40K

Slide 21

No assembly required

- and you don’t need batteries

SGI RASC Appliance for Bioinformatics

8.25” high chassis for standard 19” racks

Slide 22

Who is using this ?

• Customers and Beta-Test Installations

– National Cancer Institute / US

– Chinese National Human Genome Center / Shanghai, China

– Merck / Germany

– Université Laval / Canada

– University of Arizona / US

Thank You
