The Test and Evaluation Uses of Heterogeneous Computing
Data Fusion
23 July 2010
Dan M. Davis
(310) 448-8434
Approved for public release; distribution is unlimited.
Co-Authors
Prof. Robert F. Lucas and Gene Wagenbreth
Information Sciences Institute
University of Southern California
Marina del Rey, California 90292
{ rflucas, genew } @isi.edu
Overview
• Configuration and Design Considerations
• GPU Training and Algorithmic Programming
• Current Contributions and Research Productivity
• Robustness and Utility of GPGPU Cluster
• Plans and Opportunities for Cluster
• Lessons Learned regarding GPGPUs
Thesis
Heterogeneous Computing (GPGPUs, FPGAs, STI Cells, …) holds promise for the future
FMS and T&E have a need for HPC
In 2007 HPCMP awarded a 256-Node, GPGPU-Enhanced Linux Cluster, Joshua, to JFCOM
This asset has proven stable and useful
Many of the useful functions of GPGPUs will be applicable in the T&E community
Joshua GPGPU-Enhanced Linux Cluster at JFCOM
J9/J7 Machine Room
Suffolk, Virginia
JFCOM as a GPGPU User
● U.S. Joint Forces Command, Norfolk, Virginia
• One of DoD’s combatant commands
• Key role in transforming defense capabilities
• Current JFCOM Commander: Gen James Mattis, USMC
● Two JFCOM Directorates using agent-based simulations
• J7 - Training
- trains forces
- develops doctrine
- leads training requirements analysis
- provides an interoperable training environment
• J9 - Concept Development and Experimentation
- develops innovative joint concepts and capabilities
- provides proven solutions to problems facing the joint force
● Simulations are typically GenSer Secret and characterized by:
• Interactive use by hundreds of personnel
• Distributed trans-continentally, but must be real time
• Vast majority of users at the terminals are uniformed warfighters
Simulation Federates
● Agent-based models use rules for entity behavior
• Autonomous-agent entities
• Can be Human-In-The-Loop (HITL) and run in real time
• Large compute clusters required to run large-scale simulations
● Standard interface is HLA RTI communication (IEEE 1516)
• Supplanted the older DIS
• Publish/Subscribe model
• USC/Caltech Software Routers scale better
● Common Codes in use at JFCOM:
• Joint Semi-Automated Forces (JSAF)
• “Culture”, stripped-down civilian instantiation of JSAF
• Simulating Location & Attack of Mobile Enemy Missiles (SLAMEM)
• OneSAF
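The publish/subscribe model named above can be illustrated with a minimal sketch. This is a toy in-process router, not the IEEE 1516 RTI API and not the USC/Caltech software routers; the class name `Router`, the topic strings, and the update dictionaries are all assumptions made for illustration. It shows only the core idea: a federate receives just the updates for topics it has subscribed to, which is what keeps message traffic bounded as entity counts grow.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class Router:
    """Toy publish/subscribe router (hypothetical; not the HLA RTI interface)."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers of every subscribed federate.
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register interest in a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, update: dict) -> int:
        """Deliver an update to interested subscribers; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(update)
        return len(handlers)


# Two hypothetical federates, each interested in one class of entity.
router = Router()
ground_view: List[dict] = []
air_view: List[dict] = []
router.subscribe("ground_vehicle", ground_view.append)
router.subscribe("aircraft", air_view.append)

router.publish("ground_vehicle", {"id": 17, "pos": (3.0, 4.0)})
router.publish("aircraft", {"id": 42, "pos": (10.0, 2.0)})
router.publish("ship", {"id": 99, "pos": (0.0, 0.0)})  # no subscribers, so nothing is sent
```

Each federate's view ends up holding only the updates it asked for; the unsubscribed "ship" update is dropped at the router rather than broadcast to everyone.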
GPGPU Justification
NEED - 24x7x365 enhanced, distributed and scalable compute resources to enable joint warfighters at JFCOM … to develop, explore, test, and validate 21st century battlespace concepts … to enhance global-scale, computer-generated military experimentation by sustaining more than 2,000,000 entities on appropriate terrain with valid phenomenology.
APPROACH – Enable further growth in entity count, entity complexity, and environmental/infrastructure settings by employing a large Linux cluster with General Purpose GPUs (GPGPU) on each node to aid in line-of-sight, route planning, and plume representation, all capable of running faster than real time.
CHALLENGES – Effectively implementing a hardware configuration that provides a stable and useful platform, motivating and training operators to utilize GPGPUs, and programming simulations to take advantage of GPGPUs
Cluster Configuration as Delivered
• 256 Nodes
- Nodes - (2) AMD Santa Rosa 2220 2.8 GHz dual-core processors
- GPUs - (1) NVIDIA 8800 Video Card
- Node Chassis - 4U chassis
- Memory - 16 GB DIMM DDR2 667 per node
• GigE Inter-node Communications
• Delivery to: Joint Advanced Tactics and Training Laboratory (JATTL) in Suffolk, VA
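For a sense of aggregate scale, the per-node figures above can be multiplied out. The sketch below is plain arithmetic on the numbers on this slide; the `NodeSpec` and `cluster_totals` names are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node configuration as listed on the slide (field names are illustrative)."""
    cpu_sockets: int = 2        # (2) AMD dual-core processors per node
    cores_per_socket: int = 2   # dual-core
    gpus: int = 1               # (1) NVIDIA 8800 per node
    memory_gb: int = 16         # 16 GB per node


def cluster_totals(nodes: int, spec: NodeSpec) -> dict:
    """Multiply per-node resources by the node count."""
    return {
        "cpu_cores": nodes * spec.cpu_sockets * spec.cores_per_socket,
        "gpus": nodes * spec.gpus,
        "memory_gb": nodes * spec.memory_gb,
    }


totals = cluster_totals(256, NodeSpec())
print(totals)  # {'cpu_cores': 1024, 'gpus': 256, 'memory_gb': 4096}
```

That is, 1,024 CPU cores, 256 GPUs, and 4,096 GB of memory across the cluster.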
Perspective: Entity Growth vs. Time
[Figure: number and complexity of JSAF entities (scale and fidelity) plotted against time. Events shown: SAF Express (1997), UE 98-1 (1997), J9901 (1999), AO-00 (2000), JSAF/SPP Tests (2004), JSAF/SPP Urban Resolve (2004), JSAF/SPP Capability (2006), JSAF/SPP Joshua (2008). Entity-count labels: 3,600; 12,000; 50,000; 107,000; 250,000; 1,000,000; 1,400,000; 10,000,000. Platform annotations: SPP Proof of Principle DARPA / Caltech; DC Clusters at MHPCC & ASC MSRC; DHPI GPU-Enhanced Cluster.]
Experiments continue to require orders of magnitude larger & more complex battlespaces
Why GPUs?
● GPU performance can be 100X that of the host CPUs
• Don’t forget Prof. Gene Amdahl: 2-3X overall speedup is typical
• This differential is expected to grow
● Early OneSAF work (UNC & SAIC)
• Line of Sight
• Route Finding
• Collision Detection
• Sparse Matrix Factorization (see RFLucas paper)
● ISI verified they’re also bottlenecks in JSAF
● New ideas for use in sensor scenario creation for new multi-spectral sensors
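The gap between a 100X kernel speedup and a 2-3X application speedup follows from Amdahl's law. A minimal sketch, assuming the standard form of the law; the fractions used below (50% and 67% of runtime offloaded to the GPU) are illustrative values chosen for this example, not measurements from JSAF or OneSAF.

```python
def amdahl_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of the original runtime spent in code moved to the GPU.
    kernel_speedup: how much faster that code runs on the GPU.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / kernel_speedup)


# Illustrative fractions (assumed): half, then two thirds, of runtime on the GPU.
for fraction in (0.50, 0.67):
    print(f"{fraction:.0%} offloaded at 100X -> {amdahl_speedup(fraction, 100.0):.2f}X overall")
```

With a 100X kernel, offloading half of the runtime yields about 1.98X overall and offloading 67% yields about 2.97X, which is the 2-3X range quoted above. The unaccelerated remainder dominates, so growth in raw GPU speed helps far less than moving more of the code onto the GPU.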
Route Planning Performance Impact
Time Spent in Route Planning is a Critical Bottleneck
Early GPU Programming
● Trained ISI staff with Sparse Matrix Solver
● Then examined JSAF kernels
• Line-of-sight
• Illumination
• Route planning
● Route planning appeared easiest to integrate
● Route planning work published at I/ITSEC
● For this and other papers, see: http://www.isi.edu/~ddavis/JESPP/JESPP_Papers.html
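The slides do not reproduce the route-planning kernel itself. As a rough CPU-side reference for the kind of computation involved, here is a minimal sketch of shortest-path search (Dijkstra's algorithm) over a cost grid. The grid layout, the `plan_route` name, and the convention that `None` marks an impassable cell are assumptions for illustration; this is not the JSAF code or the algorithm from the I/ITSEC paper.

```python
import heapq
from typing import List, Optional, Tuple

Cell = Tuple[int, int]


def plan_route(cost: List[List[Optional[float]]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Cheapest 4-connected path from start to goal, or None if unreachable.

    cost[r][c] is the price of entering a cell; None marks an impassable cell.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}          # cheapest known cost to reach each cell
    came_from = {}               # back-pointers for path reconstruction
    frontier = [(0.0, start)]    # min-heap ordered by accumulated cost

    while frontier:
        dist, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if dist > best[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    came_from[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_dist, (nr, nc)))

    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(came_from[path[-1]])
    path.reverse()
    return path


# A wall (None) forces the route around the right-hand side of the grid.
terrain = [
    [1, 1, 1],
    [None, None, 1],
    [1, 1, 1],
]
route = plan_route(terrain, (0, 0), (2, 0))
```

Here `route` visits seven cells, detouring through the right-hand column. In a simulation with many entities, each entity issues searches like this independently, which is the sort of workload that lends itself to being batched onto a GPU.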
CUDA Training: PET Courses
● Dr. David Pratt conceived and organized
• HPCMP FAPOC for FMS
● Location & Dates:
• SAIC facility, Suffolk VA, 23 - 25 October 2007
• ISI, Marina del Rey, 21 - 23 October 2008
• UCSD, San Diego, 5 - 6 March 2009
● Attendees: total ~ 60 HPCMP users
● Also taught at USC as part of Parallel Programming Class
Views of CUDA Classes in Suffolk, Virginia
Typical Problem at JFCOM
The Joint Force Commander (JFC) needs to integrate and focus collection assets for persistent surveillance.
Joint Integrated Persistent Surveillance (JIPS) Goals:
• Improving and integrating system
• Developing Tactics, Techniques and Procedures (TTPs)
• Improving all of the Concept of Operations (CONOPS)
• Maximizing tipping, cueing and communications
• Using sensors to achieve persistence
Improving doctrine, organization, and TTPs
Enabling JFC to better command and support operations by: (1) effective capability apportionment and management,
(2) timely and responsive analytic support
(3) fast, reliable tactical Command, Control, Communications (C3)
Enhancing use, coordination and optimization of ISR assets
JIPS User Interface
Experimental Schedule
Benefits of GPGPU Computing
Joshua has provided many benefits; some are not easily quantified
Training, analysis or evaluation in cities otherwise off-limits due to:
• security issues
• public resistance to combat troops in their city
• diplomatic sensitivities about U.S. interest in cities of potential conflict
Joshua does save personnel costs, e.g. an Army Division costs ~$20M per day.
The DHPI cluster can run such a program using only ~100 technicians. Cost saving may be ~$19.5M each day.
Good visibility with the leadership elite:
• Congressional visits
• A Lieutenant General noted that it was probably the only time in his career he would have an opportunity to command so large a unit
1,500 soldiers across the country participated, all connected by DREN to the cluster Joshua in Suffolk.
GPGPU Technical Merit
● All challenges in the proposal fully met
● Joshua remains deployed and in service
● Two million entity goal exceeded (by a factor of five!)
● Capability of GPU demonstrated
• Developers trained to use GPUs
• Route planning kernel implemented
• Other research underway
● Joshua has changed the J9 culture
• New code being developed using client/server model
• J9 leadership now have ownership stake in HPC concepts
GPGPU Computational Merit
● JFCOM FMS requirements are uniquely military
• Modeling of DoD operations in urban terrain
• Users are most often uniformed warfighters
• Recipients of research benefits are in the field today
● Needed for a large, heterogeneous ensemble of SAFs
● Cluster provides stability and mesh provides utility
● Nationally recognized research challenges
• Scalable interest management to bound messages
• Scaling individual behavior models
• Mining distributed data logs to analyze results
• More than 31 papers in competitive conferences/journals
GPGPU Current Progress
● Deployed and accepted at JFCOM
● In use on all major J9 experiments
● In use daily during development spirals for events
● Exceeded technical goal of hosting 2M entities
● Classification issues led to partitioning
● Joshua is now fully engaged in day-to-day simulation experiments at JFCOM
• Running ensembles of SLAMEM simulations
● Ops-tempo was expected to continue and increase
• Human-in-the-loop experiments in FY10
Summary: Appropriateness
● Dedicated system was required
• classified
• interactive use
• development not amenable to batch processing
● Linux cluster
• users have adapted easily and use constantly
• design and use based on experience with DC clusters
• current SAFs need only low-cost GigE network
● Joshua has met JFCOM’s requirements
• in service creating data for JIPS
• available for new directions
New Capabilities for T&E
Paper in Real Time Hyper-Spectral
Other T&E Uses
• Any line of sight calculations
• Equation-based CFD
• Signals Processing
• Matrix multiply
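To make the line-of-sight item concrete, here is a minimal sketch of one common approach: sample the sight line between observer and target over a terrain height grid and check whether any intervening cell rises above it. The function name, the nearest-cell sampling, and the default `eye_height` and `samples_per_cell` values are assumptions for illustration; this is not the JSAF or OneSAF implementation.

```python
from typing import List, Tuple

Cell = Tuple[int, int]


def has_line_of_sight(
    height: List[List[float]],
    observer: Cell,
    target: Cell,
    eye_height: float = 2.0,
    samples_per_cell: int = 4,
) -> bool:
    """Return True if no terrain cell between observer and target blocks the sight line.

    height[r][c] is the terrain elevation; both endpoints sit eye_height above it.
    """
    (r0, c0), (r1, c1) = observer, target
    z0 = height[r0][c0] + eye_height
    z1 = height[r1][c1] + eye_height
    steps = max(abs(r1 - r0), abs(c1 - c0)) * samples_per_cell

    for i in range(1, steps):
        t = i / steps
        # Nearest terrain cell under this sample point on the ray.
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        if (r, c) == observer or (r, c) == target:
            continue  # the endpoints' own cells cannot block the view
        ray_z = z0 + t * (z1 - z0)
        if height[r][c] > ray_z:
            return False
    return True


flat = [[0.0, 0.0, 0.0, 0.0, 0.0]]
ridge = [[0.0, 0.0, 10.0, 0.0, 0.0]]
print(has_line_of_sight(flat, (0, 0), (0, 4)))   # True
print(has_line_of_sight(ridge, (0, 0), (0, 4)))  # False
```

Each observer-target pair is checked independently of every other pair, so large batches of such queries map naturally onto GPU threads.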
Research Funded by JFCOM and AFRL
This material is based on research sponsored by the U.S. Joint Forces Command via a contract with the Lockheed Martin Corporation and SimIS, Inc., and on research sponsored by the Air Force Research Laboratory under agreement numbers F30602-02-C-0213 and FA8750-05-2-0204. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government. Approved for public release; distribution is unlimited.