. RESEARCH PAPER .
SCIENCE CHINA Information Sciences
June 2012 Vol. 55 No. 6: 1280–1289
doi: 10.1007/s11432-011-4316-6
© Science China Press and Springer-Verlag Berlin Heidelberg 2011 info.scichina.com www.springerlink.com
A universal adaptive vector quantization algorithm for space-borne SAR raw data
QI HaiMing1,2∗, HUA Bin1,2, LI Xin1,2, YU WeiDong1 & HONG Wen1,2
1Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China; 2National Key Laboratory of Microwave Imaging Technology, Beijing 100190, China
Received July 2, 2010; accepted April 23, 2011; published online September 9, 2011
Abstract The codebook of conventional VQ cannot be used universally and needs real-time onboard updating,
which is hard to implement in a space-borne SAR system. To solve this problem, this paper first analyses the
characteristics of space-borne SAR raw data, then adopts the distortion function of the multidimensional
space as the design criterion, and finally proposes an adaptive codebook design algorithm based on the joint
probability density function of the input data. In addition, the feasibility of cascading the new algorithm with
entropy coding and the robustness of the algorithm to transmission errors are analysed based
on the encoding and decoding scheme. Experimental results on real data show that the codebook derived from the
new algorithm can be used universally and designed off-line, which makes VQ a practical algorithm for space-borne
SAR raw data compression.
Keywords synthetic aperture radar (SAR), compression, raw data, block adaptive quantization (BAQ),
vector quantization (VQ)
Citation Qi H M, Hua B, Li X, et al. A universal adaptive vector quantization algorithm for space-borne SAR
raw data. Sci China Inf Sci, 2012, 55: 1280–1289, doi: 10.1007/s11432-011-4316-6
1 Introduction
In the last few years, modern SAR systems have been developing towards high resolution, wide swath,
multi-polarization, multi-frequency, and multiple operation modes. The volume of SAR raw data has grown
accordingly and easily exceeds the capacity of the downlink channel and the onboard data storage. Therefore,
new compression algorithms tailored to the characteristics of SAR raw data are needed to solve the
problem of data storage and transmission.
Various raw data compression algorithms have been proposed in the past thirty years. These methods
can be generally divided into three categories: 1) scalar compression algorithms, 2) vector compression
algorithms, and 3) transform domain compression algorithms. Scalar compression algorithms include BAQ [1]
(block adaptive quantization), FBAQ [2,3] (fuzzy BAQ), BFPQ [2] (block floating point quantization),
ECBAQ [2,3] (entropy-constrained BAQ), FDBAQ [4,5] (flexible dynamic block adaptive quantization), and
AP [6] (amplitude and phase). Vector compression algorithms include VQ [7] (vector quantization),
BAVQ [2,3] (block adaptive vector quantization), BGAVQ [8] (block gain adaptive vector quantization),
TCVQ [3] (trellis coded vector quantization), and ECVQ [9] (entropy-constrained vector quantization).
∗Corresponding author (email: qi [email protected])
Qi H M, et al. Sci China Inf Sci June 2012 Vol. 55 No. 6 1281
Transform domain compression algorithms include FFT-BAQ [2,10], WHT-BAQ [10], WT [2,3] (wavelet
transform), and CS [11,12] (compressed sensing). Among these algorithms, BAQ has been successfully
used in Magellan [1], ENVISAT ASAR [13], and TerraSAR-X (see footnote 1) owing to its good tradeoff between
performance and complexity. Because of their complexity, transform domain algorithms
and vector quantization algorithms are very difficult to implement onboard, although they perform
better than the scalar compression algorithms.
The key to vector quantization is the design of an effective codebook. The codebook of
conventional VQ cannot be used universally; therefore, real-time onboard updating is necessary. However, such
updating cannot be implemented because the codebook searching algorithm is very complex and the onboard
hardware resources are limited. When the codebook is not generated onboard, the performance is
good only if the statistical properties of the input data match those of the training set. Otherwise, the
performance of conventional VQ deteriorates dramatically, which degrades the image
quality.
To overcome this problem, this paper first reviews the relationship between the standard
deviation of the input signal (SDIS) and the average signal magnitude (ASM) [14,15], then gives
the multivariate statistical model of raw data and adopts the distortion function of the multidimensional
space as the criterion. Finally, an adaptive codebook design algorithm is proposed according to the joint
probability density function of the input data. In addition, this paper analyses the feasibility of cascading the new
algorithm with entropy coding as well as the robustness of the algorithm to errors that occur
during transmission.
This paper is organized as follows. Section 2 studies the statistical properties of SAR raw data and the
multivariate statistical model of raw data. Section 3 introduces the principle of VQ as well as the design
of a codebook based on the Gaussian density function. Section 4 gives the encoding and decoding scheme for
raw data vector quantization. In Section 5, the feasibility of cascading the new algorithm with entropy
coding and the robustness of the algorithm to transmission errors are analysed based on
the encoding and decoding scheme. In Section 6, experiments are carried out on real data, and the
results show that the new algorithm performs better than conventional VQ. Conclusions are
drawn in the last section.
2 Characteristics of SAR raw data
2.1 Statistical properties of SAR raw data
The SAR echo can be viewed as a superposition of the responses of many small scatterers in each azimuth
and range resolution cell. By the central limit theorem, the in-phase (I) and quadrature (Q)
components both follow a Gaussian distribution. The amplitude follows a Rayleigh distribution and the phase is
uniformly distributed on the interval [−π, π].
Since the received signals in both I and Q channels are zero-mean Gaussian, the distribution can
be specified with a single parameter. For simplicity, the parameter ASM is selected in the engineering
implementation. For an 8-bit ADC, ref. [1] gives the relationship between ASM and SDIS. Ref. [14]
revised it as
$$ \overline{|I|} = \overline{|Q|} = 127.5 - \sum_{n=0}^{126} \operatorname{erf}\left(\frac{n+1}{\sqrt{2}\,\sigma}\right), \tag{1} $$
where $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\,dt$, and σ is the SDIS of the ADC. Figure 1 depicts the curve of Eq. (1).
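Eq. (1) can be evaluated numerically. The following minimal sketch uses Python's standard `math.erf`; the function name `asm_from_sdis` is chosen here for illustration and is not from the paper.

```python
import math

def asm_from_sdis(sigma: float) -> float:
    """Average signal magnitude of an 8-bit ADC for input standard deviation sigma, per Eq. (1)."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    # n runs from 0 to 126, matching the summation limits of Eq. (1).
    return 127.5 - sum(math.erf((n + 1) / (math.sqrt(2.0) * sigma)) for n in range(127))

# Tabulate a few points of the curve; the ASM rises from 0.5 towards 127.5 as the SDIS grows.
for sigma in (1.0, 10.0, 50.0, 200.0, 600.0):
    print(f"SDIS = {sigma:6.1f} -> ASM = {asm_from_sdis(sigma):8.3f}")
```

The two limits follow directly from Eq. (1): every erf term tends to 1 for very small σ (ASM tends to 0.5) and to 0 for very large σ (ASM tends to 127.5).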
2.2 Multivariate statistical model of raw data
Suppose that f(x) is multivariate Gaussian distributed as
$$ f(x) = \frac{1}{\sqrt{(2\pi)^k |C_x|}} \exp\left\{ -\frac{1}{2} (x - \mu_x)^{T} C_x^{-1} (x - \mu_x) \right\}, \tag{2} $$
1) Fritz T, Eineder M. TerraSAR-X ground segment basic product specification document. http://sss.terrasar-x.dlr.de/pdfs/TX-GS-DD-3302.pdf. 2009
Figure 1 Relationship between ASM and SDIS (horizontal axis: standard deviation of the input signal, 0–600; vertical axis: average signal magnitude, 0–120).
Figure 2 (a) Bivariate standard Gaussian distribution; (b) empirical bivariate distribution of ERS-1 raw data.
where μx and Cx are the mean vector and covariance matrix of the vector x, respectively, and k is the dimension
of x. Figure 2(a) shows the bivariate standard Gaussian distribution, and Figure 2(b) shows the empirical
bivariate distribution of ERS-1 raw data, sampled with a 5-bit ADC.
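For the bivariate case (k = 2) plotted in Figure 2(a), Eq. (2) can be evaluated in closed form. The sketch below is illustrative only: the function name and the nested-list covariance layout are assumptions, not part of the paper.

```python
import math

def bivariate_gaussian_pdf(x, mean, cov):
    """Evaluate Eq. (2) for k = 2; cov is a 2x2 covariance matrix given as nested lists."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mu)^T C^{-1} (x - mu), using the closed-form inverse of a 2x2 matrix.
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** 2 * det)

# Standard bivariate Gaussian (zero mean, identity covariance): the peak value is 1 / (2*pi).
identity = [[1.0, 0.0], [0.0, 1.0]]
print(bivariate_gaussian_pdf([0.0, 0.0], [0.0, 0.0], identity))
```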
3 General VQ codebook design
Vector quantization (VQ) is a well-known technique for signal compression and is the extension of
scalar quantization to vectors. It takes full advantage of: 1) both linear and non-linear dependencies among
vector components, 2) the shape of the probability density function of the source, and 3) the dimensionality of
the vector. VQ can be defined as a mapping Q from the k-dimensional Euclidean space $\mathbb{R}^k$ to the codebook Y,
which can be represented as
$$ y_i = Q(x), \quad i = 1, 2, \ldots, N, \tag{3} $$
where $x = [x_1, x_2, \ldots, x_k]^{T}$, $y_i = [y_{i1}, y_{i2}, \ldots, y_{ik}]^{T}$, $Y = \{y_1, y_2, \ldots, y_N\}^{T}$, and N is the size of the
codebook.
The flow chart of the codebook design process is shown in Figure 3. The differences between conven-
tional VQ and our algorithm are marked in the green blocks.
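This paper uses the mean-square error (MSE) as the distortion measure, which is defined as
$$ D = E\left(\|x - Q(x)\|^2\right) = \sum_{i=1}^{N} E\left(\|x - y_i\|^2\right) = \sum_{i=1}^{N} \int_{R_i} \|x - y_i\|^2 f(x)\,dx, \tag{4} $$
where f(x) is the joint probability density function and the expectation in the i-th term of the middle sum is taken over the region $R_i$ only.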
Figure 3 Flow chart of the codebook design (blocks: start, initial C0, probability density function, partition region Ri, centroid calculation Cij, distortion calculation, convergence test (Dm−1 − Dm)/Dm < ε with yes/no branches, iteration update m = m + 1, final codebook, end).
Figure 4 Practical example of the partition of a two-dimensional cell (regions R1 to R16 in the x-y plane, both axes from −4 to 4).
The main purpose of VQ design is to find a codebook that minimizes the value of D.
In the following, two main steps, which are called the nearest neighbor condition (NNC) and the
centroid condition (CC) [7] of vector quantization, are introduced.
Step 1 (NNC): with Y preassigned, the best partition $R_i$ can be written as
$$ R_i = \{x : \|x - y_i\|^2 \le \|x - y_j\|^2, \text{ for all } j \ne i\} \quad \text{and} \quad \bigcup_{i=1}^{N} R_i = \mathbb{R}^k. \tag{5} $$
Rewrite Eq. (5) in the form
$$ R_i = \{x : y_{ij} \cdot x + b_{ij} \ge 0, \text{ for all } j \ne i\}, \tag{6} $$
where $y_{ij} = y_i - y_j$, $b_{ij} = \frac{1}{2}\left(\|y_j\|^2 - \|y_i\|^2\right)$, and "·" denotes the inner product operation.
Let $H_{ij}$ denote the half-space. According to Eq. (6), $H_{ij}$ can be defined as
$$ H_{ij} = \{x : y_{ij} \cdot x + b_{ij} \ge 0\}, \tag{7} $$
and let $L_{ij}$ denote the hyperplane
$$ L_{ij} = \{x : y_{ij} \cdot x + b_{ij} = 0\}, \quad j = 1, 2, \ldots, N, \ j \ne i. $$
Then $R_i$ may be described as the intersection of half-spaces
$$ R_i = \bigcap_{j \ne i} H_{ij}, $$
and each face of the polytope $R_i$ must lie in $L_{ij}$ for some j. Figure 4 shows the partition of a
two-dimensional cell. Step 1 guarantees the optimal partition for the given codebook.
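The equivalence between the distance form of the NNC and the half-space form can be checked numerically. The sketch below assumes the sign convention b_ij = (‖y_j‖² − ‖y_i‖²)/2 with y_ij = y_i − y_j, which is the convention under which y_ij · x + b_ij ≥ 0 matches ‖x − y_i‖² ≤ ‖x − y_j‖². The random codebook and test points are purely illustrative.

```python
import random

def nearest_index(x, codebook):
    """Index i of the codeword minimizing ||x - y_i||^2 (distance form of the NNC)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))

def in_cell_by_half_spaces(x, i, codebook, tol=1e-9):
    """True if y_ij . x + b_ij >= 0 holds for every j != i (half-space form)."""
    yi = codebook[i]
    for j, yj in enumerate(codebook):
        if j == i:
            continue
        y_ij = [a - b for a, b in zip(yi, yj)]
        b_ij = 0.5 * (sum(b * b for b in yj) - sum(a * a for a in yi))
        if sum(u * v for u, v in zip(y_ij, x)) + b_ij < -tol:
            return False
    return True

# Illustrative check with a random 16-word, two-dimensional codebook.
rng = random.Random(0)
codebook = [[rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)] for _ in range(16)]
points = [[rng.uniform(-4.0, 4.0), rng.uniform(-4.0, 4.0)] for _ in range(200)]
agree = all(in_cell_by_half_spaces(x, nearest_index(x, codebook), codebook) for x in points)
```

The small tolerance `tol` only guards against rounding when a point lies almost exactly on a boundary hyperplane.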
Step 2 (CC): Assuming that the partition $\{R_i\}$ is given, a necessary condition for D to be at a stationary value is [16]
$$ \nabla_Y D = 0_{N \times k}, \tag{8} $$
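where ∇ is the gradient operator. Eq. (8) can be rewritten as
$$ \nabla_Y D = \frac{\partial D}{\partial Y} = \begin{pmatrix} \dfrac{\partial D}{\partial y_{11}} & \cdots & \dfrac{\partial D}{\partial y_{1k}} \\ \vdots & & \vdots \\ \dfrac{\partial D}{\partial y_{N1}} & \cdots & \dfrac{\partial D}{\partial y_{Nk}} \end{pmatrix} = 0_{N \times k}. \tag{9} $$
According to Eq. (4), Eq. (9) can be rewritten as
$$ \frac{\partial D}{\partial y_{ij}} = -2 \int_{R_i} (x_j - y_{ij}) f(x)\,dx = 0, \quad i = 1, 2, \ldots, N; \ j = 1, 2, \ldots, k, \tag{10} $$
and then
$$ y_{ij} = \frac{\int_{R_i} x_j f(x)\,dx}{\int_{R_i} f(x)\,dx}. \tag{11} $$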
The flow chart of the codebook design is shown in Figure 3. By iterating the two
steps, we obtain a codebook that is at least locally optimal. Since the codebook is derived from
the multidimensional statistical properties of the raw data rather than from a particular training set, it can be used generally in practice.
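The iteration of the two steps can be sketched in a few lines. The code below is a simplified illustration, not the authors' implementation: it assumes a standard bivariate Gaussian density, approximates the integrals of Eq. (11) by weighted sums over a regular grid on [−4, 4]², draws the initial codebook at random from the grid, and stops when the relative drop in distortion is below a threshold ε. All names and parameter values are hypothetical.

```python
import math
import random

def gaussian_grid(half_width=4.0, steps=33):
    """Grid points in [-half_width, half_width]^2 with normalized standard-Gaussian weights."""
    axis = [-half_width + 2.0 * half_width * i / (steps - 1) for i in range(steps)]
    points = [(a, b) for a in axis for b in axis]
    weights = [math.exp(-0.5 * (a * a + b * b)) for a, b in points]
    total = sum(weights)
    return points, [w / total for w in weights]

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def design_codebook(n_codewords=16, eps=1e-4, max_iter=100, seed=0):
    """Alternate NNC and CC on the weighted grid until the relative drop in D is below eps."""
    points, weights = gaussian_grid()
    codebook = random.Random(seed).sample(points, n_codewords)  # assumed initialization
    history = []
    for _ in range(max_iter):
        # Step 1 (NNC): assign every grid point to its nearest codeword.
        labels = [min(range(n_codewords), key=lambda i: sq_dist(p, codebook[i]))
                  for p in points]
        distortion = sum(w * sq_dist(p, codebook[l])
                         for p, w, l in zip(points, weights, labels))
        history.append(distortion)
        if len(history) > 1 and history[-2] - distortion < eps * distortion:
            break
        # Step 2 (CC): move each codeword to the probability-weighted centroid of its cell.
        acc = [[0.0, 0.0, 0.0] for _ in range(n_codewords)]
        for p, w, l in zip(points, weights, labels):
            acc[l][0] += w * p[0]
            acc[l][1] += w * p[1]
            acc[l][2] += w
        # An empty cell keeps its previous codeword.
        codebook = [(a[0] / a[2], a[1] / a[2]) if a[2] > 0.0 else c
                    for a, c in zip(acc, codebook)]
    return codebook, history

codebook, history = design_codebook()
print(f"{len(history)} distortion evaluations, D: {history[0]:.4f} -> {history[-1]:.4f}")
```

Because the density is known analytically, no training data are needed, which is what allows such a codebook to be designed off-line.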
4 Encoding and decoding scheme
Figure 5 shows the encoding and decoding scheme of the proposed algorithm. The new algorithm exploits
the statistical characteristics of the echo data. The echo data are first divided into sub-blocks of the same size. The
ASM of each sub-block is calculated and mapped to the SDIS at the ADC. After that, the sampled data are normalized by
the SDIS, and each normalized vector is assigned the corresponding codeword. Finally, the index of the codeword is transmitted
to the ground data processing system for subsequent processing.
When decoding, the codeword index and the codebook are used to retrieve the corresponding output codeword.
The SDIS of the input signal is then obtained from the mapping between ASM and SDIS and is used to
de-normalize the codeword.
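The scheme can be sketched for one sub-block as follows. This is an illustration under stated assumptions rather than the onboard implementation: the ASM is assumed to be sent as side information together with the indices, the ASM-to-SDIS mapping is obtained by inverting Eq. (1) with bisection, the block length is assumed to be a multiple of the vector dimension, and the toy codebook and simulated samples are hypothetical.

```python
import math
import random

def asm_from_sdis(sigma):
    """Eq. (1): average signal magnitude of an 8-bit ADC for input standard deviation sigma."""
    return 127.5 - sum(math.erf((n + 1) / (math.sqrt(2.0) * sigma)) for n in range(127))

def sdis_from_asm(asm, lo=1e-3, hi=1e4, iterations=60):
    """Invert Eq. (1) by bisection; valid because the ASM does not decrease with the SDIS."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if asm_from_sdis(mid) < asm:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nearest_index(x, codebook):
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))

def encode_block(samples, codebook):
    """Return (ASM, codeword indices) for one sub-block of scalar samples."""
    k = len(codebook[0])
    asm = sum(abs(s) for s in samples) / len(samples)
    sigma = sdis_from_asm(asm)
    vectors = [samples[i:i + k] for i in range(0, len(samples), k)]
    indices = [nearest_index([s / sigma for s in v], codebook) for v in vectors]
    return asm, indices

def decode_block(asm, indices, codebook):
    """Look up each codeword and de-normalize it with the SDIS recovered from the ASM."""
    sigma = sdis_from_asm(asm)
    return [sigma * c for i in indices for c in codebook[i]]

# Toy two-dimensional codebook (product of four scalar levels), for illustration only.
levels = [-1.5, -0.5, 0.5, 1.5]
toy_codebook = [(a, b) for a in levels for b in levels]

# Simulated 8-bit ADC output levels (+-0.5, +-1.5, ...) for a Gaussian input with SDIS = 20.
rng = random.Random(1)
samples = [max(-127.5, min(127.5, math.floor(rng.gauss(0.0, 20.0)) + 0.5)) for _ in range(1024)]

asm, indices = encode_block(samples, toy_codebook)
reconstructed = decode_block(asm, indices, toy_codebook)
estimated_sdis = sdis_from_asm(asm)
```

In a real system the codebook would come from the design procedure of Section 3, and the ASM-to-SDIS mapping would more likely be tabulated than solved by bisection.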
5 Feasibility of the entropy coding and channel transmission robustness
Entropy is a measurement of information. According to the principles of information theory, information
entropy is the theoretical limit of data compression. If the new algorithm is followed by entropy coding,
with SNR fixed, the compression ratio of the vector quantization can be further increased. Therefore, we
analyse the feasibility of the new algorithm in cascade with entropy coding.
During transmission, channel noise may cause errors on the receiving side
of the channel, which would degrade the quantization result or even affect the application of the raw data.
Therefore, it is necessary to analyse the robustness of the algorithm when errors occur during transmission.
5.1 Feasibility analysis of the vector quantization in cascade with entropy coding
First consider the feasibility of entropy coding. Figure 6 shows the flow chart of the vector quantization
in cascade with entropy coding. For k-dimensional vector quantization with a codebook of size N,
the information entropy is defined as
$$ H(x) = -\sum_{i=1}^{N} p(x_i) \log_2 p(x_i), \tag{12} $$
where $p(x_i) = \int_{R_i} f(x)\,dx$ is the integral of f(x) over region $R_i$, which is equal to the probability of the
corresponding cell of the partition.
Figure 5 Vector quantization coding and decoding flowchart.
Figure 6 Flow chart of vector quantization in cascade with entropy coding.
Table 1 Average symbol entropy and lossless compression ratio for two-dimensional VQ
Bits per sample                        1 bit     2 bit     3 bit     4 bit     5 bit     6 bit
Average symbol entropy (bit/symbol)    0.9948    1.93727   2.91002   3.89174   4.8805    5.8168
Lossless compression ratio             1.005     1.0324    1.0309    1.0278    1.0245    1.0315
For simplicity, the average entropy is used to represent the average information provided by each symbol,
which is
$$ H_k(x) = -\frac{1}{k} \sum_{i=1}^{N} p(x_i) \log_2 p(x_i). \tag{13} $$
When the compression ratio is 8:2, the average entropy of two-dimensional vector quantization is
$$ H_2(x) = -\frac{1}{2} \sum_{i=1}^{16} p(x_i) \log_2 p(x_i) = 1.93727 \ \text{(bit/symbol)}. $$
Its lossless compression ratio is 2/1.93727 = 1.0324.
Similarly, the corresponding results of 8:1, 8:3, 8:4, 8:5 and 8:6 two-dimensional vector quantization
are listed in Table 1.
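The quantities in Table 1 follow from Eq. (13) and a simple ratio. The sketch below shows the arithmetic; the function names are illustrative, and the cell probabilities of the actual codebook are not given in the paper, so a uniform distribution is used only to show the upper bound.

```python
import math

def average_symbol_entropy(cell_probabilities, k):
    """Eq. (13): entropy of the codeword indices divided by the vector dimension k."""
    h = -sum(p * math.log2(p) for p in cell_probabilities if p > 0.0)
    return h / k

def lossless_compression_ratio(bits_per_sample, entropy_per_sample):
    """Ratio between the fixed-length rate and the entropy bound, both in bit/symbol."""
    return bits_per_sample / entropy_per_sample

# Upper bound: 16 equally likely codewords of a two-dimensional VQ give exactly 2 bit/symbol.
print(average_symbol_entropy([1.0 / 16.0] * 16, k=2))        # 2.0

# Entry of Table 1 for the 8:2 case, using the reported entropy of 1.93727 bit/symbol.
print(round(lossless_compression_ratio(2, 1.93727), 4))       # 1.0324
```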
Figure 7 Flow chart of vector quantization based on channel noise.
As shown in Table 1, the lossless compression ratio achievable by entropy coding is very small for the two-dimensional
vector quantizer (a gain of about 3% at most), and the complexity of entropy coding makes it difficult to implement onboard.
Therefore, entropy coding is unnecessary in practice.
5.2 Error analysis of channel transmission
As shown in Figure 7, the vector quantization encoder searches the codebook $C = \{y_0, y_1, \ldots, y_{N-1}\}$ for
the codeword $y_i$ that best matches the input vector x. The codeword index satisfies
$$ i = \arg\min_{0 \le p \le N-1} d(x, y_p). \tag{14} $$
The index of the codeword is transmitted to the receiver. If there is no channel noise, the receiver will
receive the index i, and then reconstruct the input vector x as yi from the LUT. If the channel is noisy,
the receiver may not receive the index i but receive the index m. Then the decoder will reconstruct the
input vector x as ym. Since ym is not the best match for the input vector, additional distortion will be
introduced during the decoding process.
In the transmission process, an error does not propagate, because vector quantization is a fixed-length
code. For k-dimensional vector quantization, a single bit error corrupts only k samples.
Take a satellite link with an error rate of 10^−7 as an example. Assuming the transmitted data size is 16384 × 16384
(range × azimuth), the number of received errors is
ceil(16384 × 16384 × 10^−7) = 27.
For two-dimensional vector quantization, the number of erroneous samples is
2 × 27 = 54.
That is, because of transmission channel errors, 54 samples are in error after decoding on the ground.
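The error count above is a short calculation; the following sketch reproduces it. The function name is illustrative, and the estimate follows the simplification used in the text (one error event per erroneous codeword, each affecting k samples).

```python
import math

def erroneous_samples(n_range, n_azimuth, error_rate, k):
    """Return (number of erroneous codewords, number of erroneous samples) for k-dimensional VQ."""
    n_errors = math.ceil(n_range * n_azimuth * error_rate)
    return n_errors, k * n_errors

print(erroneous_samples(16384, 16384, 1e-7, 2))   # (27, 54)
```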
In the raw data domain, for M-bit ADC data, define the peak distortion introduced by channel errors as
$$ \mathrm{PDSE} = \frac{N \times (2^M - 1)^2}{\sum_{i=1}^{K_a} \sum_{j=1}^{K_r} (x_{ij})^2}, \tag{15} $$
where $K_a$ and $K_r$ are the numbers of azimuth and range samples, respectively, and N is the number of erroneous samples.
Take the data from the ERS as an example. The size of the data is 5616 × 16384 (range × azimuth) with
a 5-bit ADC and two-dimensional vector quantization. If the satellite error rate is 10^−7, the
number of erroneous samples is ceil(5616 × 16384 × 10^−7) × 2 = 20. Substituting it into Eq. (15), we
get PDSEers = 1.82 × 10^−5.
The additional distortion introduced by transmission channel is far less than that introduced by data
compression, so the system has a strong robustness for channel error.
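Eq. (15) is the ratio of the worst-case error energy to the total energy of the raw data. A minimal sketch of the computation is given below; the function name and the toy data are illustrative, and the ERS value of 1.82 × 10^−5 cannot be reproduced here because it depends on the actual raw data.

```python
def pdse(n_error_samples, adc_bits, samples):
    """Eq. (15): peak distortion caused by n_error_samples erroneous samples of an M-bit ADC."""
    peak_power = (2 ** adc_bits - 1) ** 2          # largest possible squared error per sample
    energy = sum(x * x for row in samples for x in row)
    return n_error_samples * peak_power / energy

# Toy 2 x 3 data set from a hypothetical 5-bit ADC with one erroneous sample.
toy = [[3.0, -4.0, 1.0], [0.0, 2.0, -5.0]]
print(pdse(1, 5, toy))
```

For a fixed number of errors, the ratio shrinks as the data set grows, which is why the distortion for a full scene is so small.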
5.3 The complexity of the VQ in realization
Take a 7-bit ASM and an 8-bit ADC as an example. In practice, the encoding scheme is realized by a
look-up table (LUT). For two-dimensional vector quantization, the depth of the LUT is 2^(7+8+8) = 2^23. For
three-dimensional vector quantization, the depth of the LUT is 2^(7+8+8+8) = 2^31. Current hardware
resources do not have enough storage capacity for LUTs of this size. Therefore, we need to use the cascade scheme shown in
Figure 8. The pre-process includes interception of high bits (IH) and so on. After that, the data width
is reduced, and the required depth of the LUT decreases significantly.
Figure 8 The cascade scheme of vector quantization (raw data, pre-process, vector quantization).
Figure 9 Imaging results of real SAR data (sub-blocks numbered 1 to 16).
For example, interception of the higher 4 bits in cascade with two-dimensional vector quantization
requires the corresponding codebook to be stored in an LUT with a depth of 2^(7+4+4) = 2^15. For
three-dimensional vector quantization, the depth is 2^(7+4+4+4) = 2^19. The required storage is thus reduced significantly.
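The LUT depths quoted above follow from the address width: the ASM bits plus the bits of each of the k samples. The sketch below reproduces the figures; the helper `intercept_high_bits` assumes unsigned offset-binary samples, which is an assumption made here for illustration and is not stated in the paper.

```python
def lut_depth(asm_bits, sample_bits, k):
    """Number of LUT entries when the address is the ASM followed by k samples."""
    return 2 ** (asm_bits + k * sample_bits)

def intercept_high_bits(sample, total_bits=8, kept_bits=4):
    """Keep only the most significant bits of an unsigned sample (assumed offset-binary)."""
    return sample >> (total_bits - kept_bits)

for k in (2, 3):
    full = lut_depth(7, 8, k)
    reduced = lut_depth(7, 4, k)
    print(f"k = {k}: full depth 2^{full.bit_length() - 1}, "
          f"after interception 2^{reduced.bit_length() - 1}")
```

Each bit removed per sample halves the address space k times, so keeping 4 of 8 bits shrinks the table by a factor of 2^(4k).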
6 Numerical experiments
6.1 Evaluation index
In order to verify the effectiveness of the new algorithm, this paper evaluates the compression performance
in data domain.
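Define the signal-to-noise ratio (SNR) as
$$ \mathrm{SNR} = 10 \lg \left[ \frac{\sum_{i=1}^{K_a} \sum_{j=1}^{K_r} \left(z^{d}_{ij}\right)^2}{\sum_{i=1}^{K_a} \sum_{j=1}^{K_r} \left(z^{d}_{ij} - \hat{z}^{d}_{ij}\right)^2} \right], \tag{16} $$
where $K_a$ and $K_r$ are the same as in Eq. (15), $z^{d}_{ij}$ is the original SAR raw data after the ADC, and $\hat{z}^{d}_{ij}$ is
the reconstructed raw data.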
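Eq. (16) translates directly into code. The following sketch assumes real-valued data stored as a list of rows; for complex raw data the I and Q channels could be evaluated separately or the squared moduli used instead. The function name is illustrative.

```python
import math

def snr_db(original, reconstructed):
    """Eq. (16): ratio of signal energy to reconstruction-error energy, in dB."""
    signal = sum(z * z for row in original for z in row)
    noise = sum((z - z_hat) ** 2
                for row, row_hat in zip(original, reconstructed)
                for z, z_hat in zip(row, row_hat))
    if noise == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal / noise)

# Signal energy 25 and error energy 1 give 10 * lg(25), about 13.98 dB.
print(round(snr_db([[3.0, 4.0]], [[3.0, 3.0]]), 2))
```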
6.2 Real SAR raw data introduction
The following experiments are based on real SAR data from the Institute of Electronics, Chinese
Academy of Sciences (IECAS). The image is divided into 16 sub-blocks as shown in Figure 9. The
performance comparison of conventional VQ and the new algorithm in the data domain is shown in
Figure 10. The compression ratios are 8:2 and 8:3.
6.3 Performance in data domain
SNR curves in data domain are shown in Figure 10.
Since the codebook searching algorithm is very complex and the onboard hardware resources are
limited, codebook updating is difficult to implement onboard. Therefore, for conventional VQ the codebook obtained
from the first block is used to encode all the data. As shown in Figure 10, the performance of
conventional VQ is poor. The proposed algorithm, which utilizes the joint probability density function of the input
data, obtains a general codebook for SAR data. The performance of the new algorithm is better than
that of conventional VQ, except for the first block. Conventional VQ adapts poorly to other data but is closely
matched to its training data, whereas the joint probability density of the real raw data may differ slightly from the ideal
model. Therefore, the SNR of conventional VQ is a little higher than that of the new algorithm in the
first block.
Figure 10 SNR curves in data domain for conventional VQ and improved VQ (horizontal axis: the number of sub-images, 1–16; vertical axis: SNR in dB). (a) 2-bit compression; (b) 3-bit compression.
7 Conclusions
To solve the problem of poor adaptability of conventional vector quantization (VQ), an improved VQ
algorithm is proposed based on the joint probability density function of the input data. Moreover, the
encoding and decoding scheme for raw data vector quantization are given. Besides, this paper analyses
the feasibility of the new algorithm in cascade with entropy coding, as well as the robustness of the
algorithm when error occurs during transmission. The results of real data show that codebook derived
from the new algorithm can be generally used and designed off-line, which makes VQ a practical algorithm
for space-borne SAR raw data compression.
Acknowledgements
This work was supported by Special Fund to the Winner of CAS Excellent Doctoral Dissertation President
Reward (Grant No. 0813260042), and National Key Laboratory of Microwave Imaging Technology Fund (Grant
No. 9140C1903041003). The authors would like to thank Professor Ian G. Cumming from the University of
British Columbia for his helpful suggestions, Professor John C. Curlander for his constructive suggestions, and Dr.
Pietro Guccione from Politecnico di Bari for his lively discussion.
References
1 Kwok R, Johnson W T K. Block adaptive quantization of Magellan SAR data. IEEE Trans Geosci Remote, 1989, 27:
375–383
2 Boustani A E, Branham K, Kinsner W. A review of current raw SAR data compression techniques. In: Canadian
Conference on Electrical and Computer Engineering 2001. Toronto: IEEE, 2001. 925–930
3 Benz U, Strodl K, Moreira A. A comparison of several algorithms for SAR raw data compression. IEEE Trans Geosci
Remote, 1995, 33: 1266–1276
4 Snoeij P, Attema E, Guarnieri A M, et al. GMES Sentinel-1 FDBAQ performance analysis. In: 2009 IEEE Radar
Conference. California: IEEE, 2009. 1–6
5 Snoeij P, Attema E, Guarnieri A M, et al. FDBAQ a novel encoding scheme for Sentinel-1. In: Proceedings of IGARSS
2009. Cape Town, 2009. I-44-I-47
6 Agrawal N, Venugopalan K. Amplitude phase algorithm for SAR signal processing. In: Proceedings of the 2009 1st
International Conference on Computational Intelligence, Communication Systems and Networks. Washington DC:
IEEE, 2009. 351–356
7 Linde Y, Buzo A, Gray R M. An algorithm for vector quantizer design. IEEE Trans Commun, 1980, 28: 84–95
8 Lebedeff D, Mathieu P, Barlaud E, et al. Adaptive vector quantization for raw SAR data. In: IEEE International
Conference on Acoustics, Speech, and Signal Processing. Detroit: IEEE, 1995. 2511–2514
9 Zhao D, Samuelsson J, Nilsson M. On entropy-constrained vector quantization using Gaussian mixture models. IEEE
Trans Commun, 2008, 56: 2094–2104
10 Fischer J, Benz U, Moreira A. Efficient SAR raw data compression in frequency domain. In: Proceedings of IGARSS99.
Hamburg: IEEE, 1999. 2261–2263
11 Bhattacharya S, Blumensath T, Mulgrew B, et al. Fast encoding of synthetic aperture radar raw data using compressed
sensing. In: IEEE Workshop on Statistical Signal. Madison: IEEE, 2007. 448–452
12 Herman M A, Strohmer T. High-resolution radar via compressed sensing. IEEE Trans Signal Proces, 2009, 57: 2275–
2284
13 Qi H M, Yu W D. Study of effect of raw data compression on space-borne InSAR interferometry based on real data
(in Chinese). J Electron Inform Technol, 2008, 30: 2693–2697
14 Qi H M, Yu W D, Chen X. Piecewise linear mapping algorithm for SAR raw data compression. Sci China Ser F-Inf
Sci, 2008, 51: 2126–2134
15 Qi H M, Yu W D. Anti-saturation block adaptive quantization algorithm for SAR raw data compression over the whole
set of saturation degrees. Prog Nat Sci, 2009, 19: 1003–1009
16 Chen D T S. On two or more dimensional optimum quantizers. In: IEEE International Conference on Acoustics,
Speech, and Signal Processing. Hartford: IEEE, 1977. 640–643