Date post: | 16-Dec-2015 |
Category: |
Documents |
Upload: | rudolph-fleming |
View: | 219 times |
Download: | 3 times |
Stream Processing ofX-ray Microdiffraction Data
on Multicores
Yuzhen Xie, University of Western Ontario (UWO)
joint work withAlain Biem, IBM ResearchMichael A. Bauer, UWOStewart McIntyre, UWO
Nobumichi Tamura, Lawrence Berkeley National Lab
AMMCS, July 2011
Motivation• Efficiently use of multi-core processors to process large
blocks of synchrotron XRD data generated at high rates (1 to 10 images per second of each 4MB)
• Develop high-performance kernels to achieve near real-time data analysis for synchrotron experiments, the goal of the Active Network Interchange for Scientific Experimentation (ANISE) project
Synchrotron X-ray White-beam Microdiffraction
Incident X-ray (5 – 30 KeV)
CCD Camera
Sample
Diffracted beams
Dectris Pilatus 1M CCD at ALS (2010): sub-second readout
An image showing the Laue microdiffration pattern of a unit-cell in a crystal sample
Process of Laue Patterns for Micro-texture Analysis
Background fit and removal (optional)
Example of Crystallographic Orientation and Strain Maps (courtesy: Jing Chao and Marina Fuller, UWO)
Strain map, average strain: 9.92 x 10-3
Result by XMAS (X-ray Microdiffraction Analysis Software), Advanced Light Source
Orientation map
Reference Software Packages
• XMAS (X-ray Microdiffraction Analysis Software), Advanced Light Source
• 3D X-ray Microdiffraction Analysis Software Package in IDL, Advanced Photon Source
• A prototype of C code for a selection of features in Laue pattern analysis, Science Studio and ANISE projects, UWO
Best sequential processing time: 25 to 50 seconds per image
77
Stream Processing Illustration
Continuous Ingestion Continuous Analysis
8
IBM Streams Programming Model
Streams Processing Language (SPADE)
Input OutputProcess
Platform optimized compilation
Laue XRD Processing System on Streams
Processing Elements (mainly User-defined Operators (UDOPs))
Preprocessing
-Formatting-Parsing-XRD image data
BackgroundRemoval
BlobSearching Peak Fitting Indexing
-Blobs search-Scheduling for parallel peak fitting
XRD
Im
age
Str
eam
Filters available-Parabolic-2D Bruckner-2D Mean Filter
- Lorentz - Gaussian- Pearson VII
Split operator
Functions Available
Strain
BundleSorting
Key Implementation Techniques
• Efficient Source operator for parsing image files: block reading and type casting
• Fine-grained pipelining and cache-efficient background filters
• Memory-efficient parallel peak fitting
• Organize common parameter values as a stream for shared-use in indexing and strain analysis
A Fine-pipelined Background Filter based on Parabolic Method
A Pilatus TIFF Image before and after Background Removal
Memory-efficient Parallel Peak Fitting
Data Management: the Key Issue Blob center b: data set Rb (db x db) is needed for fitting a peak with center at p. Peak center p: data set Rp is needed for integrated intensity computation.
Assume p is not far from b. Define R to be the square region (2db x 2db) with center at b. Attach a data set R to a blob tuple rather than passing the whole image to each peaking fitting element. Determine Rb and Rp by coordinate mapping in R.
Small data size, good locality, no memory contention, …, and hence efficiency.
A SPADE Code Snippet for Blob Searching and Parallel Peak Fitting
## Parse an imagestream engStream(height: Integer, width: Integer, emax: …, evalues: DoubleList):= Source()[“file://c4-3_001.spe”,udfbinformat=“speParser”, blocksize=65536*15]{}
## Search blobs and generate blob stream
stream blobStream( groupid: Integer, blobid: Integer, …, lroi: DoubleList):= Udop(engStream)[“blobSearch”]{np=“NUM_PF”}
## Split blobs to subgroupsfor_begin @c 0 to NUM_PF-1 stream subBlobStream@c(groupid: Integer, blobid: Integer, …, lroi: DoubleList)for_end := Split(blobStream)[groupid]{}
## Parallel peak fitting for subgroups of bobs and bundle all peaks together
bundle peakBundle := ()for_begin @c 0 to NUM_PF-1 stream subPeakFitStream@c(numblobs: Integer, x: Integer, …, inten: Double) := Udop(subBlobStream@c)[“peakFitting”]{}peakBundle += subPeakFitStream@cfor_end
Organize Common Parameter Values as one Stream for Shared-use in Indexing and Strain Refinement of all XRD Images
kin
q3
q1 q2
31
2
kout
q
2
Known crystal structure and energy range (5-30 keV)
List of peak positions on the CCD
Find triplets 1, 2, 3 (thus q1,q2,q3) matching calculated and measured values within a given angular tolerance
Calculated qhkl list of reflections
Experimental qi list of reflections
Choose triplets indexing the largest number of reflections within a given angular tolerance. Look for “missing” reflections.
Beam direction kin,
Detector position and dimensions
Strain refinement
Streams Live Graph: One Pipeline with 4 Processing Elements for Parallel Peak Fitting
2.5 seconds per image (2084*2084) on an Intel Core2 Quad CPU Q9550 (2.83 GHz, 8 GB RAM and 6 MB L2 cache)
ImageSourcing
ImageSourcing
Blob Search&
Scheduling
Blob Search&
SchedulingParallel
Peak FittingParallel
Peak Fitting IndexingIndexingParameterSourcing
ParameterSourcing StrainStrain
Super-linear speedup obtained on an Intel Core2 Quad CPU Q9550
Streams Live Graph: 4 Pipelines to Process 4 Images Concurrently in Streaming Mode
Conclusion
• We present the first stream processing application in the field of synchrotron XRD data analysis.
• We show that stream processing is an effective model for efficiently using multicore processors for XRD image data analysis.
• Our system provides a high-performance processing kernel to achieve near real-time data analysis of image data from synchrotron experiments.
• Our work-in-progress include: evaluation, optimization, configuration and deployment of this kernel to large systems with many cores to process large set of XRD images in parallel and streaming mode.
Thank You!