Alan D. George, Ph.D. Director, NSF CHREC Center
Professor of ECE, University of Florida
Herman Lam, Ph.D. Assoc. Professor of ECE, University of Florida
2012 NSF I/UCRC Annual Meeting
CHREC and Novo-G: An Innovative and Synergistic Research Project and
The World’s Most Powerful Reconfigurable Supercomputer
Research highlighted in this presentation was supported in part by the I/UCRC Program of the
National Science Foundation under Grant No. EEC-0642422.
12 Jan 2012
Outline
CHREC Center overview
CHREC sites, faculty, & students
CHREC members & memberships
Industry impact & technology transfer
Reconfigurable computing
Novo-G Overview
Machine architecture
Application acceleration
International Novo-G forum
Conclusions and Looking Ahead
I/UCRC grant originated in Sep. 2006
• 1 university site, 9 membership commitments
Strong growth in first 5 years (Phase-I)
• Grown to 4 university sites (UF, GW, BYU, VT)
• Grown to 29 members (aerospace, IT, etc.)
• Grown to 42 memberships (all full, $35K/ea)
Strong scholarship record
• >115 refereed journal & conference papers
• Several NSF CAREER awards
• Best-paper awards, keynotes, etc.
Strong graduation record
• Dozens of Ph.D. & M.S. graduates to date
• Many hired by CHREC members
• Dozens more served with members as interns
World-class facilities developed in-house
• Novo-G: world’s top reconfigurable computer
• HokieSpeed: GPU-centric supercomputer
• Pyramid: CPU-centric supercomputer
Center Mission and Theme
• R&D to advance S&T in nexus of reconfigurable,
high-performance, and/or high-performance
embedded computing (i.e., RC, HPC, HPEC)
• Computing performance, power, adaptivity,
scalability, productivity, cost, size, weight, etc.
• From space satellites to supercomputers!
CHREC Faculty
University of Florida (lead)
• Dr. Alan D. George, Professor of ECE – Center Director
• Dr. Herman Lam, Associate Professor of ECE
• Dr. Ann Gordon-Ross, Assistant Professor of ECE
• Dr. Greg Stitt, Assistant Professor of ECE
• Dr. Jose Principe, Distinguished Professor of ECE and BME
• Dr. Andy Li, Associate Professor of ECE
• Dr. Vikas Aggarwal, Research Scientist in ECE
Brigham Young University
• Dr. Brent E. Nelson, Professor of ECE – BYU Site Director
• Dr. Michael J. Wirthlin, Professor of ECE
• Dr. Brad L. Hutchings, Professor of ECE
• Dr. Michael Rice, Professor of ECE
George Washington University
• Dr. Tarek El-Ghazawi, Professor of ECE – GWU Site Director
• Dr. Vikram Narayana, Assistant Research Professor in ECE
Virginia Tech
• Dr. Peter Athanas, Professor of ECE – VT Site Director
• Dr. Wu-Chun Feng, Associate Professor of CS and ECE
• Dr. Patrick Schaumont, Assistant Professor of ECE
• Dr. Heshan Lin, Senior Research Associate in CS
Most importantly, CHREC features an exceptional team of >40 graduate students spanning our 4 university sites.
CHREC Members
1. AFRL Munitions Directorate (4)
2. AFRL Sensors Directorate
3. AFRL Space Vehicles Directorate (2)
4. Altera
5. AMD
6. Arctic Region Supercomputing Center (2)
7. Army RD&E Command
8. Boeing Research & Technology
9. GiDEL
10. Harris
11. Honeywell (2)
12. Intel
13. Lockheed Martin MFC
14. Lockheed Martin SSC
15. Lockheed Martin SVIL
16. Los Alamos National Laboratory (2)
17. Mentor Graphics
18. Monsanto
19. NASA Goddard Space Flight Center
20. NASA Marshall Space Flight Center
21. National Instruments (2)
22. National Security Agency (4)
23. Northrop-Grumman Aerospace Systems
24. Oak Ridge National Laboratory (2)
25. Office of Naval Research
26. Sandia National Laboratories
27. SEAKR Engineering
28. Veritomyx
29. Xilinx (2)
42 memberships ($35K/ea) from 29 members in 2011
Industry Impact & Tech Transfer
12 projects spanning broad areas of RC, HPC, HPEC
• Performance – optimizing speed, power, scalability, adaptability
  Parallel algorithms, applications, architectures (FPGA, GPU, Manycore)
• Productivity – reducing design complexity for developers and users
  Design concepts, tools, modeling, middleware, compilation, integration
• Aerospace – addressing unique needs in this key community
  Space-based processing, reliable architectures, partial reconfiguration
Industry impact
• CHREC drives & influences many industry programs
• Annual surveys routinely cite millions of $ per year in industry impact
• Many very close relationships between sites & members
Technology transfer to date
• Dozens of industry personnel hires, dozens of internships
• >115 new papers and >30 new tools crafted with/for members
What is Reconfigurable Computing?
General characteristics:
Architecture adapts to match unique needs of each app
e.g., FPGA; “Custom Fit” usage strategy; reconfigurable by task or app
Relatively new and revolutionary paradigm of computing
Limited but growing list of available devices, tools, systems, and apps
Technical advantages:
GREAT performance when app not well suited to fixed processor
Why? Customized hardware parallelism (width, depth), data precision
(size, format), operations and units (type, quantity), memory structure, etc.
LOWER energy consumption than fixed processors (CPU, GPU)
Technical disadvantages:
Relatively new and immature paradigm of computing
Programming complexity with adaptive hardware
Causes: inherent with novelty of approach; “newness” of field and tools
What is Novo-G?
Motivation
Growing computational demands in many science and
engineering domains becoming principal bottleneck
Scalable RC systems (e.g., Novo-G) uniquely capable
of both high performance and low energy, cooling, TCO
Goals: Investigate, develop, evaluate, & showcase:
Most powerful RC machine ever fielded for research
Innovative suite of productivity tools for app development
Impactful set of scalable kernels/apps in key science areas
Emphases
Performance (system), Productivity (concepts/tools), Impact (apps)
Theme
Novo-G is an RC-centric machine (not merely CPUs with accelerators!)
Features FPGA/RAM coupling (4.25 or 8.5 GB in 3 banks coupled to each FPGA)
Features FPGA/FPGA coupling (up to 8 coupled; e.g., systolic array, virtual FPGA)
CPUs and GPUs serve in supporting role (e.g., I/O, preprocessing, postprocessing)
Novo-G Machine
1 head-node server (1U) with:
• 2 Xeon E5520 2.26 GHz quad-core CPUs
• 24GB ECC DDR3, 3 x 1TB SATA2
24 compute servers (4U), each with:
• Xeon E5520 quad-core CPU
• 6GB ECC DDR3, 250GB SATA2
• 2 GiDEL ProcStar-III PCIe x8 cards, each
with 4 Stratix-III E260 FPGAs and
4x4.25 = 17GB RAM
6 compute servers (4U), each with:
• 2 Xeon E5620 2.4GHz quad-core CPUs
• 16GB ECC DDR3, 2TB SATA2
• 4 GiDEL ProcStar-IV PCIe x8 cards, each
with 4 Stratix-IV E530 FPGAs and
4x8.5 = 34GB RAM
• GTX-480 GPU
* Our cluster vendor is Ace Computers
Novo-G Annual Growth
2009: 96 top-end Stratix-III FPGAs,
each with 4.25GB SDRAM
2010: 96 more Stratix-III FPGAs,
each with 4.25GB SDRAM
2011: 96 top-end Stratix-IV FPGAs,
each with 8.5GB SDRAM
2012: 96 more Stratix-IV FPGAs,
each with 8.5GB SDRAM
FPGA totals: 192 Stratix-III, 96 Stratix-IV
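The FPGA counts follow from the server, card, and per-card figures listed on the Novo-G Machine slide. A minimal sketch of that arithmetic is shown here; the class and field names are illustrative and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerGroup:
    """One group of identical compute servers (names are hypothetical)."""
    name: str
    servers: int
    cards_per_server: int
    fpgas_per_card: int
    gb_per_fpga: float

    @property
    def fpgas(self) -> int:
        # servers x cards per server x FPGAs per card
        return self.servers * self.cards_per_server * self.fpgas_per_card

    @property
    def gb_per_card(self) -> float:
        return self.fpgas_per_card * self.gb_per_fpga

    @property
    def total_gb(self) -> float:
        return self.fpgas * self.gb_per_fpga


# Figures taken from the machine description in the slides.
GROUPS = [
    ServerGroup("ProcStar-III / Stratix-III E260", 24, 2, 4, 4.25),
    ServerGroup("ProcStar-IV / Stratix-IV E530", 6, 4, 4, 8.5),
]

for g in GROUPS:
    print(f"{g.name}: {g.fpgas} FPGAs, "
          f"{g.gb_per_card:g} GB per card, {g.total_gb:g} GB FPGA-attached RAM")
```

Both groups happen to carry the same amount of FPGA-attached memory in total, because the newer group has half as many FPGAs with twice the memory each.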
Impactful Novo-G App Research: BioRC examples
Each 3D chart (for Smith-Waterman and Needleman-Wunsch) illustrates performance of a single FPGA under varying
input conditions. Each table shows scaling performance with varying number of FPGAs under optimal input conditions.
Jaguar supercomputer @ ORNL: 224,256 cores (2.4 GHz hexacore Opterons) @ 6.95 MW
K Computer in Japan (largest supercomputer in world): 548,352 cores ; “uses enough electricity to power almost 10,000 homes at a cost of about $10 million per year” (New York Times - 06/19/11)
Needleman-Wunsch (NW)
Baseline: 192·2^25 length-850 sequence comparisons
Software runtime: 11,026 CPU·hours on 2.4 GHz Opteron
# FPGAs       Runtime (sec)   Speedup
1             47,616          833
4             12,014          3,304
96            503             78,914
128           391             101,518
192 (est.)    270             147,013
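The slide does not state how the speedup column is derived. The figures appear consistent with dividing the software baseline, converted to seconds, by the measured FPGA runtime. The sketch below checks that assumption against the Needleman-Wunsch rows and also reports scaling efficiency relative to a single FPGA. The function names are illustrative.

```python
CPU_HOURS = 11_026  # software baseline from the slide (2.4 GHz Opteron)

# (number of FPGAs, runtime in seconds, speedup listed on the slide)
NW_ROWS = [
    (1, 47_616, 833),
    (4, 12_014, 3_304),
    (96, 503, 78_914),
    (128, 391, 101_518),
    (192, 270, 147_013),  # marked as an estimate on the slide
]


def speedup(cpu_hours: float, fpga_seconds: float) -> float:
    """Assumed definition: baseline CPU time divided by FPGA runtime."""
    return cpu_hours * 3600.0 / fpga_seconds


def scaling_efficiency(n_fpgas: int, runtime: float, one_fpga_runtime: float) -> float:
    """Fraction of ideal linear scaling achieved with n_fpgas."""
    return one_fpga_runtime / (n_fpgas * runtime)


for n, secs, listed in NW_ROWS:
    computed = speedup(CPU_HOURS, secs)
    eff = scaling_efficiency(n, secs, NW_ROWS[0][1])
    print(f"{n:>3} FPGAs: computed {computed:>10.0f}  listed {listed:>8}  "
          f"efficiency {eff:.1%}")
```

Under this assumption the computed values agree with the listed ones to within a fraction of a percent, and efficiency stays above 90% through the 192-FPGA estimate.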
By contrast, with 192+192 FPGAs (Summer 2012), Novo-G performance on key BioRC
apps approaches that of 500K CPU cores at <16 kW
Smith-Waterman (SW) with trace-back
Baseline: database length 2^26 bases vs. 512 length-500 sequences
Software runtime: 7,126 CPU·hours on 2.4 GHz Opteron
# FPGAs       Runtime (sec)   Speedup
1             25,927          989
4             6,482           3,958
96            271             94,639
128           206             124,710
192 (est.)    137             187,492
Smith-Waterman (SW) with trace-back: optimal algorithm for local alignment
of DNA and RNA sequences
Needleman-Wunsch (NW): optimal algorithm for global alignment
of DNA and RNA sequences
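For readers unfamiliar with the two algorithms, the sketch below shows the textbook dynamic-programming recurrences in software. It computes alignment scores only (no trace-back) and uses a linear gap penalty. The scoring values are assumptions chosen for illustration, since the slides do not give the scoring scheme used on Novo-G.

```python
def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch global alignment score (linear gap penalty)."""
    prev = [j * gap for j in range(len(b) + 1)]  # row 0: b aligned against gaps
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)           # column 0: a aligned against gaps
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub,      # align a[i-1] with b[j-1]
                         prev[j] + gap,          # gap in b
                         cur[j - 1] + gap)       # gap in a
        prev = cur
    return prev[-1]


def sw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Smith-Waterman local alignment score (linear gap penalty)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets a local alignment restart anywhere.
            cur[j] = max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


print(nw_score("GATTACA", "GATTACA"))     # global score of identical strings
print(sw_score("TTTACGTTT", "GGACGGG"))   # local score driven by shared "ACG"
```

The only differences between the two are the boundary conditions and the floor at zero: the global score must account for every character of both sequences, while the local score keeps the best-scoring region and ignores the rest.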
Novel systolic array architecture
• Complex-controller performance with simple-controller overhead
• Extendable across FPGAs using neighbor bus
• Computation of trace-back for SW overlapped with hardware processing of next sequence
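The slides do not detail the systolic array itself. As general background, alignment recurrences map well onto such arrays because every cell on the same anti-diagonal of the score matrix depends only on earlier anti-diagonals, so those cells can be computed in the same step by separate processing elements. The sketch below is a software emulation of that generic wavefront order, assuming one processing element per query character and illustrative scoring values. It is not the CHREC design.

```python
def sw_score_wavefront(query: str, db: str,
                       match: int = 1, mismatch: int = -1, gap: int = -1):
    """Smith-Waterman score computed anti-diagonal by anti-diagonal.

    Returns (best local score, number of wavefront steps). Each step
    corresponds to one anti-diagonal; its cells are mutually independent.
    """
    m, n = len(query), len(db)
    if m == 0 or n == 0:
        return 0, 0
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    steps = 0
    for d in range(2, m + n + 1):          # cells with i + j == d
        lo = max(1, d - n)
        hi = min(m, d - 1)
        for i in range(lo, hi + 1):        # one emulated processing element per i
            j = d - i
            sub = match if query[i - 1] == db[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,   # from anti-diagonal d - 2
                          H[i - 1][j] + gap,       # from anti-diagonal d - 1
                          H[i][j - 1] + gap)       # from anti-diagonal d - 1
            best = max(best, H[i][j])
        steps += 1
    return best, steps


score, steps = sw_score_wavefront("TTTACGTTT", "GGACGGG")
print(f"score={score}, wavefront steps={steps}, matrix cells={9 * 7}")
```

With m query characters and n database characters, the matrix has m·n cells but only m + n − 1 anti-diagonals, which is the source of the parallel speedup when enough processing elements are available.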
[Figure: Smith-Waterman with trace-back. Single-FPGA speedup plotted against database length (nucleotides) and sequence length (nucleotides).]
Technology Transfer
CHREC BLAST Toolset (Monsanto)
• Computation demand in bioinformatics becoming prohibitive bottleneck
• Novo-BLAST: accelerates BLAST’s word matching algorithm up to 19x on single Stratix III
• BLAST-wrapped SW: Smith-Waterman core (previous slide) with BLAST wrapper; SSEARCH-like accuracy with BLAST-like performance
• Code transfer & field test in 1st qtr. 2012
Isotopic Pattern Calculator (Veritomyx)
• Dominating bottleneck in proteomics app for cancer research
• Measured up to 470x speedup for single Stratix IV FPGA
• Code transfer & field test in 1st qtr. 2012
Broad Range of Novo-G App Research
• BioRC: Smith-Waterman (w/ or w/o trace-back), Needleman-Wunsch, Needle-Distance, Isoformic proteomics, BLASTp (collaboration with Boston University), CHREC BLAST Toolset (Novo-BLAST and BSW: BLAST-wrapped SW)
• FinRC: e.g., Barrier options using Heston model
• DSP: e.g., Information-theoretic approach to image segmentation
• Domain exploration in other science and engineering fields
Very promising results (speed, energy)
• 50x to 5000x speedup per FPGA vs. fast CPU core
International Novo-G Forum
Founded in January 2010
International community research forum to explore performance,
productivity, and sustainability of RC at scale
Consists of 11 academic teams using common platform
Each team working on its own research apps and/or tools
Each team has one or more local Novo-G quad-FPGA boards
Remote access to big Novo-G @ Florida for large-scale runs
Boston University
Clemson University
Federal University of Pernambuco (Brazil)
University of Florida
George Washington University
University of Glasgow (UK)
Imperial College (UK)
Northeastern University
University of South Carolina
University of Tennessee
Washington University in St. Louis
Conclusions and Looking Ahead
RC: revolutionary paradigm of computing
• Architecture adapts to match unique needs of each app
CHREC Novo-G reconfigurable supercomputer
• Most powerful RC machine ever fielded for research
• World-class speedups for key apps in science and engineering
• Rivaling the world’s largest conventional supercomputers
• But at a tiny fraction of their size, power, cost, and weight
Synergistic activity
• Leverages private, state, and federal funding resources
• Close partnership with CHREC member organizations: Altera, GiDEL, Monsanto, Veritomyx, et al.
• Novo-G Forum: international team of 11 universities
Novo-G future: science and engineering domain exploration
• New RC-amenable apps in BioRC, DSP, and FinRC
• Explore new promising domains, e.g., computational chemistry, cryptanalysis