Date post: 25-Jul-2020
Riccardo Ferrara: riccardo.ferrara@acsys.it
Stefano Marra: [email protected]

Advanced Computer Systems s.p.a.
Via della Bufalotta 378, 00139, Rome, Italy - www.acsys.it

Algorithm description and implementation

Earth Observation SAR data is acquired by transmitting a radar signal from a moving satellite platform and collecting the echoes returned by the Earth's surface during a short time span. To produce the final image, the echoes scattered by the same ground target must be concentrated on the same point along both the range (cross track) and azimuth (along track) directions. This process, known as focusing, is the purpose of the Omega-K algorithm.

The test input data is a range compressed image of 26620x18427 float complex values (in azimuth and range direction respectively), occupying about 3.7 GB in total. To overcome the GPU memory limitation, the algorithm subdivides the grid into a configurable number of azimuth blocks.

Once in device memory, each block is processed entirely on the GPU. CUDA streams allow host/device data transfers to run concurrently with kernel execution for subsequent blocks. A diagram of the processing chain for each block follows.
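The block subdivision and the quoted data size can be illustrated with a short sketch. It is not the authors' code: the helper name `split_azimuth_blocks`, the choice of 8 blocks, and the 8 bytes per sample (single-precision complex, as suggested by "float complex") are assumptions made for illustration only.

```python
def split_azimuth_blocks(n_azimuth, n_blocks):
    """Return (start, stop) azimuth-line ranges that cover [0, n_azimuth)."""
    if not 1 <= n_blocks <= n_azimuth:
        raise ValueError("n_blocks must be between 1 and n_azimuth")
    base, extra = divmod(n_azimuth, n_blocks)
    ranges, start = [], 0
    for b in range(n_blocks):
        # The first `extra` blocks take one additional line each.
        stop = start + base + (1 if b < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


N_AZ, N_RG = 26620, 18427   # image size quoted in the poster
BYTES_PER_SAMPLE = 8        # assumption: two 32-bit floats per complex value
N_BLOCKS = 8                # hypothetical value of the configurable block count

total_bytes = N_AZ * N_RG * BYTES_PER_SAMPLE
blocks = split_azimuth_blocks(N_AZ, N_BLOCKS)
largest_block_bytes = max(stop - start for start, stop in blocks) * N_RG * BYTES_PER_SAMPLE

print(f"total image size: {total_bytes / 2**30:.2f} GiB")
print(f"largest block:    {largest_block_bytes / 2**20:.1f} MiB")
```

Under these assumptions the full image is about 3.65 GiB, consistent with the "about 3.7 GB" quoted above, and each of the 8 blocks takes roughly 468 MiB before any working buffers are counted.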

Test Hardware

CPU: 2 x Intel(R) Xeon(R) CPU E5-2660 (2 x 8 cores at 2.20GHz)
RAM: 12 GB
HDD: 4 x Seagate Cheetah 600GB 15K RPM

GPU model: Nvidia Tesla M2090
Memory: 6GB
Architecture: Fermi
Compute Capability: 2.0

Conclusions

The research has shown the potential of GPU programming for near-real-time SAR data processing. The next step is to port the entire processor to the GPU, including the range compression and Doppler estimation stages. Given the compute-bound limit observed on the current hardware, it would also be interesting to run the processor on a Kepler GPU and compare the results.

Marco Fratarcangeli: [email protected]
Davide Tiriticco: [email protected]

Sapienza University of Rome, DIAG
Via Ariosto 25, 00185, Rome, Italy

GPU Accelerated SAR Omega-K Focusing

A parallel CUDA implementation of the Omega-K algorithm for Synthetic Aperture Radar (SAR) data focusing is presented. It is compared with a multithreaded CPU implementation developed by ACS, which currently operates in the context of European Space Agency (ESA) and Italian Space Agency (ASI) SAR missions (e.g. ASAR, Cosmo-Skymed). A speedup factor of 15X has been measured in the test environment (for the pure algorithmic part, excluding disk I/O operations), without quality degradation of the resulting image.

Research

Motivation

SAR data processing is a computationally intensive task. To give an idea, the data used for the test is a Cosmo-Skymed acquisition of 8 seconds, yet an efficient CPU implementation of a SAR processor takes several minutes to process it, even on a multi-processor server configuration. On the other hand, the focusing algorithm is inherently highly parallelizable: it requires the application of a series of filters in the frequency domain, where the output for each input value can be computed independently.

Results

1 Time comparison

The following is a time performance comparison of the overall Omega-K stage execution between the operational parallel CPU implementation and the research GPU implementation, both run on the same test machine. The goal of this research is to highlight the performance gain in a real operational scenario, so disk I/O times are included in the comparison.

The graph values represent the speed-up factors relative to the CPU implementation.

The overall focusing time has dropped from 208 seconds (CPU) to 19 seconds (GPU).

2 Quality comparison

The first phase-preservation tests have shown that the implementation meets the requirement defined by ASI for this type of SAR acquisition.

The diagram shows a comparison of the focused image range and azimuth resolutions between the CPU (blue) and GPU (red) implementations for two point targets. The deviation is below the required threshold.
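The timing figures from the time comparison can be cross-checked with a few lines of arithmetic. The overall ratio follows directly from the quoted 208 s and 19 s. The split into compute and disk I/O is only a back-of-the-envelope estimate, not a measurement reported by the poster: it assumes that disk I/O takes the same time in both runs, that it does not overlap with computation, and that the 15X figure from the abstract applies exactly to the remaining compute time.

```python
cpu_total_s = 208.0   # overall CPU focusing time, from the poster
gpu_total_s = 19.0    # overall GPU focusing time, from the poster
algo_speedup = 15.0   # algorithm-only speedup quoted in the abstract

# Overall speedup, disk I/O included.
overall_speedup = cpu_total_s / gpu_total_s

# Assumed model (hypothetical, see lead-in):
#   cpu_total = io + cpu_algo
#   gpu_total = io + gpu_algo
#   cpu_algo  = algo_speedup * gpu_algo
# Subtracting the first two equations eliminates io.
gpu_algo_s = (cpu_total_s - gpu_total_s) / (algo_speedup - 1.0)
cpu_algo_s = algo_speedup * gpu_algo_s
io_s = gpu_total_s - gpu_algo_s

print(f"overall speedup:        {overall_speedup:.1f}x")
print(f"implied GPU compute:    {gpu_algo_s:.1f} s")
print(f"implied CPU compute:    {cpu_algo_s:.1f} s")
print(f"implied disk I/O share: {io_s:.1f} s")
```

The overall ratio comes out at about 10.9x, lower than the 15X algorithm-only figure, which is what one would expect once a fixed I/O cost is added to both sides.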

Image Data:
Location: Etna
Size: 26620 x 18427 (48 km x 44 km)
Sensing Time: 2008-04-29 17:22
Satellite: Cosmo-Skymed SAR1
Mode: Stripmap
Polarization: HH

A batched FFT plan common to all blocks is used, so that the FFT of all rows is performed with a single operation. The same plan is used for both the direct and the inverse transform.
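The idea of a reusable batched plan can be sketched in plain Python. This is a conceptual stand-in, not the GPU code: the class name `BatchFFTPlan` is hypothetical, the row length is restricted to a power of two, and the transform is a simple radix-2 FFT. The poster does not name the FFT library; in cuFFT, for example, a plan created with `cufftPlanMany` can be executed with `cufftExecC2C` in either direction (`CUFFT_FORWARD` or `CUFFT_INVERSE`), which is the behaviour mimicked here.

```python
import cmath


def _bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


class BatchFFTPlan:
    """Precomputed tables for radix-2 FFTs of rows of length n (power of two)."""

    def __init__(self, n):
        if n < 1 or n & (n - 1):
            raise ValueError("n must be a power of two")
        self.n = n
        bits = n.bit_length() - 1
        # Computed once, then shared by every row and both directions.
        self.rev = [_bit_reverse(i, bits) for i in range(n)]
        self.tw = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

    def execute(self, rows, inverse=False):
        """Transform all rows in one call; the result is left unnormalized."""
        return [self._transform(row, inverse) for row in rows]

    def _transform(self, row, inverse):
        n = self.n
        a = [row[self.rev[i]] for i in range(n)]
        size = 2
        while size <= n:
            half, step = size // 2, n // size
            for start in range(0, n, size):
                for k in range(half):
                    w = self.tw[k * step]
                    if inverse:
                        w = w.conjugate()  # inverse reuses the same table
                    u = a[start + k]
                    v = a[start + k + half] * w
                    a[start + k] = u + v
                    a[start + k + half] = u - v
            size *= 2
        return a


plan = BatchFFTPlan(4)
rows = [[1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j], [0j, 1 + 0j, 0j, -1 + 0j]]
spectrum = plan.execute(rows)                       # forward, all rows
restored = plan.execute(spectrum, inverse=True)     # inverse, same plan
restored = [[v / plan.n for v in row] for row in restored]  # undo the scale factor
```

As in cuFFT, the inverse is left unnormalized here, so a forward/inverse round trip has to be divided by the row length to recover the input.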

[Figure: processing chain diagram. Stage labels: AZ FFT, AZ Block Subdivision, AZ Block Merge, Omega-K Compression]

• The chirp filtering method has been chosen to perform the Stolt mapping operation.

• The chirp filter operates in range direction: the azimuth block is transposed before entering the Stolt mapping stage, to allow for coalesced memory access.

• The chirp filter is computed on the fly for each range block in a GPU kernel.

• The compression filter operates in azimuth direction: a transposition is executed before and after the filter application.

• Compression filter computation and application are performed within a single kernel, thus avoiding filter storage in global memory.
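Two of the ideas in the list above, transposing so that the filtered direction becomes contiguous in memory and computing the filter on the fly instead of storing it, can be sketched on the CPU as follows. This is a minimal illustration under stated assumptions: all function names are hypothetical, the data is a flat row-major list, and `toy_phase` is a placeholder phase, not the Omega-K compression or chirp filter.

```python
import cmath


def transpose(block, n_rows, n_cols):
    """Transpose a flat row-major n_rows x n_cols block into n_cols x n_rows."""
    return [block[r * n_cols + c] for c in range(n_cols) for r in range(n_rows)]


def apply_filter_fused(block, n_lines, n_samples, phase):
    """Multiply each sample by exp(1j * phase(line, sample)).

    The filter value is computed where it is used, so no filter array is
    ever stored (on a GPU this would avoid a global-memory buffer).
    """
    return [
        block[line * n_samples + s] * cmath.exp(1j * phase(line, s))
        for line in range(n_lines)
        for s in range(n_samples)
    ]


def toy_phase(line, sample):
    # Hypothetical placeholder, NOT the real Omega-K reference function.
    return 0.01 * line * sample * sample


def filter_along_columns(block, n_rows, n_cols, phase):
    """Filter along the column direction of a row-major block."""
    # Transpose so that each column becomes a contiguous line.
    t = transpose(block, n_rows, n_cols)
    t = apply_filter_fused(t, n_cols, n_rows, phase)
    # Transpose back to the original layout.
    return transpose(t, n_cols, n_rows)


demo = [complex(i, -i) for i in range(6)]          # 2 rows x 3 columns
filtered = filter_along_columns(demo, 2, 3, toy_phase)
```

On a GPU the transposition matters because neighbouring threads that read neighbouring addresses can have their accesses coalesced into fewer memory transactions; in this CPU sketch it only changes the iteration order.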

Cosmo-SkyMed Image ©ASI (2008)

All rights reserved

[Figure: processing chain diagram, further labels. Stages: Stolt Mapping, RG Block Subdivision, RG FFT, RG Block Merge, RG IFFT, AZ IFFT. Filters: Compression Filter Bank, Chirp Filter. Images: range compressed image, focused image, with axes Azimuth (AZ) and Range (RG)]

[Figure: WK compression filter, real part. Vertical axis: Re(Z); horizontal axis: samples]

Contact name: Riccardo Ferrara (riccardo.ferrara@acsys.it)
Poster: P4207
Category: Video & Image Processing - VI08
