Design of A Custom Vector Operation API Exploiting SIMD Intrinsics within Java
Presented by John-Marc Desmarais
Authors: Jonathan Parri, John-Marc Desmarais, Daniel Shapiro, Miodrag Bolic and Voicu Groza
Overview
• Introduction
• What is SIMD?
• jSIMD Userflow
• Issues and Considerations
• Java Native Interface
• Current Implementation
• Results
• Future Work
• Conclusion
carg.site.uottawa.ca
CARG 2010
Introduction
SIMD (Single Instruction Multiple Data)
• Many embedded systems have begun to take advantage of the Java framework.
• JVMs can be embedded or rest on top of the OS.
• SIMD is an often underutilized option available on many processors. (O3 compilation)
• In Java, it is up to the JVM to decide at runtime how best to use SIMD, if it is available.
SIMD
[Figure: a single Instruction is issued to multiple Functional Units, each operating on its own Data.]
Single Instruction Multiple Data: multiple processing elements that perform the same operation on multiple data elements simultaneously.
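To make the idea concrete, the following is a minimal plain-Java sketch of what "one instruction, multiple data" means for an element-wise addition. It does not use real SIMD hardware; the inner loop only models the lanes of one vector instruction. The class name `LaneWiseAdd` and the lane count of four (four 32-bit floats in one 128-bit register) are assumptions made for illustration.

```java
import java.util.Arrays;

// Illustrative model only: each pass of the outer loop stands in for
// ONE SIMD instruction that adds LANES elements at the same time.
final class LaneWiseAdd {
    // Assumption: a 128-bit register holding four 32-bit floats.
    static final int LANES = 4;

    static float[] add(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        float[] out = new float[a.length];
        int i = 0;
        // Vector part: process LANES elements per step.
        for (; i + LANES <= a.length; i += LANES) {
            for (int lane = 0; lane < LANES; lane++) {
                out[i + lane] = a[i + lane] + b[i + lane];
            }
        }
        // Scalar tail: leftover elements that do not fill a full register.
        for (; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4, 5};
        float[] b = {10, 20, 30, 40, 50};
        // prints [11.0, 22.0, 33.0, 44.0, 55.0]
        System.out.println(Arrays.toString(add(a, b)));
    }
}
```

On real hardware the inner lane loop would collapse into a single instruction; the scalar tail shows why array lengths that are not a multiple of the lane count need separate handling.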
Common SIMD Implementations
• x86/x64: AMD 3DNow!, Intel MMX, SSE (Streaming SIMD Extensions)
• PowerPC: AltiVec from Apple, IBM and Freescale
• SPARC: VIS from Sun Microsystems
[Figure: SSE registers xmm0, xmm1, xmm2, xmm3, each 128 bits wide.]
jSIMD: User Flow
Current SIMD Optimization Java Flow
• Standard Java
• Profile at Runtime
• Change Java Code
• Runtime JVM to SIMD Mapping: may take very long or may not even achieve the best SIMD usage
jSIMD SIMD Optimization Approach
• Standard Java with jSIMD
Issues & Considerations
Packing
• Packing and aligning data into SIMD registers is very time consuming.
Transactional
• Intermediate values should not leave SIMD memory and register space.
Target Specifics
• Various targets have different SIMD implementations. (SIMD may not even exist, requiring a fallback.)
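The packing and transactional points can be illustrated with a small sketch: data is packed once into a contiguous, native-ordered buffer, several operations are chained on the packed form, and the result is unpacked only at the end. This is not the jSIMD API; the class `PackedVector` and its methods `pack`, `add`, `scale` and `unpack` are hypothetical names, and the arithmetic here is plain Java standing in for native SIMD calls.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Arrays;

// Hypothetical sketch: pack once, chain operations, unpack once.
final class PackedVector {
    // Direct buffer: contiguous off-heap memory that native code could
    // read without an extra copy (assumption: the native side wants this layout).
    private final FloatBuffer data;

    private PackedVector(FloatBuffer data) {
        this.data = data;
    }

    // The expensive step: copy Java array contents into packed storage.
    static PackedVector pack(float[] values) {
        FloatBuffer buf = ByteBuffer
                .allocateDirect(values.length * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int i = 0; i < values.length; i++) {
            buf.put(i, values[i]);
        }
        return new PackedVector(buf);
    }

    // In-place add; the intermediate result stays in packed storage.
    PackedVector add(PackedVector other) {
        if (other.data.capacity() != data.capacity()) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < data.capacity(); i++) {
            data.put(i, data.get(i) + other.data.get(i));
        }
        return this;
    }

    // In-place scale; again no unpacking between operations.
    PackedVector scale(float k) {
        for (int i = 0; i < data.capacity(); i++) {
            data.put(i, data.get(i) * k);
        }
        return this;
    }

    // The other expensive step: copy the final result back to a Java array.
    float[] unpack() {
        float[] out = new float[data.capacity()];
        for (int i = 0; i < out.length; i++) {
            out[i] = data.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] result = pack(new float[]{1, 2, 3, 4})
                .add(pack(new float[]{1, 1, 1, 1}))
                .scale(2f)
                .unpack();
        // prints [4.0, 6.0, 8.0, 10.0]
        System.out.println(Arrays.toString(result));
    }
}
```

The design choice shown is that `add` and `scale` return the packed object rather than a `float[]`, so the packing cost is paid once per chain of operations instead of once per operation.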
jSIMD: SIMD for Java
Java and the JNI
• Java allows programs to use native libraries.
• SIMD instructions can be called manually from native code.
• Solution! Map all SIMD intrinsics into JNI, making them invisible to the Java programmer.
• No system-specific code or headers are permitted in the library, so compilation can be performed automatically on any platform.
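The JNI approach described above could look roughly like the sketch below on the Java side: a `native` method is declared, the native library is loaded once, and a pure-Java path is used when no native SIMD library is available. The library name `jsimd_sketch`, the class `NativeVectorOps` and the method `nativeAdd` are hypothetical and are not taken from jSIMD; the matching C implementation is not shown.

```java
import java.util.Arrays;

// Hypothetical sketch of hiding a native SIMD call behind a Java method.
final class NativeVectorOps {
    // Assumption: "jsimd_sketch" is a made-up library name for illustration.
    private static final boolean NATIVE_AVAILABLE = tryLoad("jsimd_sketch");

    private static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // No native library on this target: use the pure-Java fallback.
            return false;
        }
    }

    // Would be implemented in C through JNI, using the target's SIMD intrinsics.
    private static native void nativeAdd(float[] a, float[] b, float[] out, int n);

    // The only method the Java programmer sees; the native call stays hidden.
    static float[] add(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        float[] out = new float[a.length];
        if (NATIVE_AVAILABLE) {
            nativeAdd(a, b, out, a.length);
        } else {
            for (int i = 0; i < a.length; i++) {
                out[i] = a[i] + b[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("native library loaded: " + NATIVE_AVAILABLE);
        System.out.println(Arrays.toString(add(new float[]{1, 2}, new float[]{3, 4})));
    }
}
```

Because the caller only sees `add`, the same Java program runs unchanged whether or not a native library exists for the current target.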
jSIMD: SIMD for Java
Current Implementation
Running Targets:
• Intel x86/x64
• AMD x86/x64
• SPARC
• PowerPC
Future Targets:
• NIOS II with custom SIMD Unit
jSIMD: User Perspective
• Extended Java ISA with parallel SIMD operations.
• Native operations hidden as Java methods.
• User is not concerned with native interface.
Transparency
[Figure: Java runs on the Base Java ISA extended by the jSIMD API, which is backed by Native SIMD Mappings in C.]
Results
JVM versus Programmer Know-How
The JVM does an impressive job of SIMD mapping, but it is not as effective as a determined programmer who understands the underlying target architecture.
Results
Design Space Exploration
• Multiprocessors (add more processors)
• Memory Hierarchy
• Coprocessors & Hardware Accelerators
• Multi-Core Interconnect Topology
• SIMD (Single Instruction Multiple Data)
• ISE (Instruction Set Extensions)
• GPU (Graphics Processing Units)
Future Work
Future as DSE Avenue
Manual Specification and Automatic Detection for Native SIMD
[Figure: tool flow with the stages Profile, Analysis, Vectorize, and Rewrite Java for jSIMD.]
Conclusion
• VMs should integrate this approach into their languages.
• Until such time as VM support is made available, programmers can use our API to accelerate their applications.
We have shown that jSIMD can be used to accelerate VM-based applications more effectively than contemporary automated solutions.