Project Proposal for EE 5359: Multimedia Processing
Investigation of Image Quality of Dirac, H.264 and H.265
Biju Shrestha (UTA ID: 1000113697 Email: [email protected])
The University of Texas at Arlington
416 Yates Street, Arlington, Texas 76019-0016
Acronyms and Abbreviations
AVC advanced video coding
BBC British Broadcasting Corporation
CBR constant bit rate
CODEC coder and decoder
FRExt fidelity range extensions
FSIM feature similarity index
GM gradient magnitude
HEVC high efficiency video coding
HVS human visual system
IEC international electrotechnical commission
ISO international organization for standardization
IST integer sine transform
ITU-T international telecommunication union - telecommunication standardization sector
JPEG joint photographic experts group
LIVE laboratory for image and video engineering
MICT media information and communication technology laboratory
MPEG moving picture experts group
MSE mean squared error
MSU Moscow State University
PC phase congruency
PSNR peak signal to noise ratio
RGB red, green and blue
SSIM structural similarity index
TID2008 Tampere image database 2008
VBR variable bit rate
VCEG video coding experts group
Abstract
Several video compression standards exist, each offering improved performance and quality
over its predecessors [2]. This project proposes to investigate the image quality of Dirac,
H.264 and H.265 on various test sequences using metrics such as SSIM, FSIM and
bitrate [3, 5, 7]. Conventional metrics such as PSNR and MSE measure pixel intensity
differences and cannot measure subjective fidelity [3], so they will be of little interest
in this project.
Introduction
A video codec is a tool that compresses and decompresses digital video [2]. There are
several video compression methods. Those discussed in this project are Dirac, H.264
and H.265 [1-3].
Dirac
The Dirac video codec was initially developed by BBC Research [1]. It is an open source software
project that is powerful and flexible despite using only a small number of core tools [1].
Dirac offers the following features [1]:
Multi-resolution transforms
Inter and intra frame coding
Frame and field coding
Dual syntax
CBR and VBR operations
Variable bit depths
Multiple chroma sampling formats
Lossless and lossy coding
Choice of wavelet filters
Simple stream navigation
Dirac has three main strands [15]. The first is a compression specification for the byte stream
and the decoder [15]. The second is software for compression and decompression, and the third
is a set of algorithms designed to support simple and efficient hardware implementations [15].
Although Dirac resembles many other video coding systems, its design aims to combine
effectiveness, efficiency and simplicity. The decoder and encoder architectures of Dirac are
shown in figures 1 and 2 respectively.
Figure 1. Dirac decoder architecture [18]
Figure 2. Dirac encoder architecture [15]
H.264
H.264, also referred to as AVC, is a standard for video compression [2]. H.264/MPEG-4
AVC is an international video coding standard jointly developed by the VCEG of the
ITU-T and the MPEG of ISO/IEC [11]. It provides enhanced coding efficiency for a wide range
of applications such as video telephony, video conferencing, TV, storage, streaming video,
digital video authoring and digital cinema [11]. In addition, the FRExt provides enhanced
capabilities relative to the base specification [11].
H.264 does not define a CODEC. Instead, it defines the syntax of the encoded bit stream and
how that bit stream is decoded [1]. Figures 3 and 4 show an H.264 decoder and encoder
respectively. The various profiles of H.264 are shown in figure 5.
Figure 3. H.264 decoder [2]
Figure 4. H.264 encoder [2]
Figure 5. Various profiles of H.264 [12]
H.265
H.265, also known as HEVC [3], can deliver significantly improved compression
performance relative to AVC (ITU-T H.264 | ISO/IEC 14496-10) [10]. Alshina et al.
[16] investigated coding efficiency for high resolution HD 1080p video and reported average
bit savings of 37% for the hierarchical B structure and 36% for the IPPP structure
compared to MPEG-4 AVC. A typical block-based video codec is
composed of many processes, including intra prediction and inter prediction, transforms,
quantization, entropy coding and filtering [17], as shown in Figure 6. Over the past decade,
video coding techniques have gone through intensive research to achieve higher coding
efficiency [17].
Figure 6. Encoder block diagram of H.265. Grey boxes are proposed tools and white boxes are
H.264/AVC tools [17]
Image Quality Assessment using SSIM and FSIM
Digital images and videos are prone to various kinds of distortion during
acquisition, processing, compression, storage, transmission and reproduction [5]. This
degradation results in poor visual quality. Several metrics are widely used to
quantify image quality, such as FSIM, SSIM, bitrate, PSNR and MSE [3, 8, 13, 14]. This project
will focus primarily on SSIM, FSIM and bitrate. The conventional metrics
PSNR and MSE will not be measured because they depend directly on pixel intensity
differences and do not correlate well with subjective fidelity ratings [3]. In particular, MSE
cannot model the human visual system very accurately [4]. The FSIM and SSIM values measured
for Dirac, H.264 and H.265 will be compared to characterize the relative quality of the
three codecs.
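For reference, the conventional metrics set aside above and the bitrate can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 8-bit peak value of 255, and the use of NumPy are assumptions, not part of the proposal's toolchain.

```python
import math

import numpy as np


def mse(ref, dist):
    """Mean squared error between two images of the same shape."""
    ref = np.asarray(ref, dtype=np.float64)
    dist = np.asarray(dist, dtype=np.float64)
    return float(np.mean((ref - dist) ** 2))


def psnr(ref, dist, peak=255.0):
    """Peak signal to noise ratio in dB; peak=255 assumes 8-bit samples."""
    err = mse(ref, dist)
    if err == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)


def bitrate_kbps(size_bytes, num_frames, fps):
    """Average bitrate of an encoded sequence in kilobits per second."""
    duration_s = num_frames / fps
    return size_bytes * 8 / duration_s / 1000.0


# Hypothetical usage: a flat 8-bit image and a copy shifted by 12 gray levels.
reference = np.full((4, 4), 100.0)
shifted = reference + 12.0
print(mse(reference, shifted), psnr(reference, shifted))
print(bitrate_kbps(size_bytes=1_500_000, num_frames=300, fps=30))
```

Both MSE and PSNR depend only on the pixel-wise difference, which is why two distortions with very different visual impact can receive the same score.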
SSIM assesses image quality based on the degradation of structural information
[5]. It assumes that the human visual system is adapted to extract structural
information from images [14]. Thus, retaining the structural signal is important for image
fidelity measurement. Figure 7 shows the difference between nonstructural and structural
distortions. Nonstructural distortions are changes in luminance, contrast,
gamma and spatial shift, and are usually caused by environmental and instrumental
conditions during image acquisition and display [14]. Structural
distortions, on the other hand, include additive noise, blur and lossy compression [14], and
they change the structure of an image [14]. Figure 8 shows the measurement system used in the
calculation of SSIM.
Figure 7. Difference between nonstructural and structural distortions [14]
Figure 8. Block diagram of SSIM measurement system [5]
SSIM is based on the evaluation of three components, luminance, contrast and structure,
which are described mathematically by equations (1), (2) and (3) respectively [7].
\[ l(x, y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \tag{1} \]
\[ c(x, y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \tag{2} \]
\[ s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3} \tag{3} \]
Here,
µx and µy = local sample means of x and y respectively
σx and σy = local sample standard deviations of x and y respectively
σxy = local sample covariance between x and y (their cross-correlation after mean removal)
C1, C2, and C3 = constants that stabilize the computations when denominators become small
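The general form of the SSIM index is obtained by combining equations (1), (2) and (3) [7]:
\[ \mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma} \tag{4} \]
Here, α, β and γ are parameters that set the relative importance of the three
components. Setting α = β = γ = 1 and C3 = C2/2 gives [7]:
\[ \mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \tag{5} \]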
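The relationship between equations (1) to (5) can be checked numerically with a minimal Python sketch for a single window. It uses the standard SSIM forms from Wang et al. [5]. Several choices here are assumptions that this proposal does not state: the constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2 with L = 255, the convention C3 = C2/2, population (not unbiased) statistics, and all function names.

```python
import numpy as np


def ssim_terms(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Return luminance, contrast and structure terms (eqs. 1-3) for one window."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2  # assumed constants, following [5]
    c2 = (k2 * dynamic_range) ** 2
    c3 = c2 / 2.0
    mu_x, mu_y = x.mean(), y.mean()
    sigma_x, sigma_y = x.std(), y.std()
    sigma_xy = np.mean((x - mu_x) * (y - mu_y))
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    con = (2 * sigma_x * sigma_y + c2) / (sigma_x ** 2 + sigma_y ** 2 + c2)
    struct = (sigma_xy + c3) / (sigma_x * sigma_y + c3)
    return float(lum), float(con), float(struct)


def ssim_general(x, y, alpha=1.0, beta=1.0, gamma=1.0):
    """Equation (4): weighted product of the three terms.

    The structure term can be negative, so non-integer gamma is not handled here.
    """
    lum, con, struct = ssim_terms(x, y)
    return lum ** alpha * con ** beta * struct ** gamma


def ssim_simplified(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Equation (5): closed form for alpha = beta = gamma = 1 and C3 = C2/2."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    sigma_xy = np.mean((x - mu_x) * (y - mu_y))
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (x.var() + y.var() + c2)
    return float(num / den)


# Hypothetical 8x8 window and a distorted copy (contrast change plus offset).
window = np.arange(64, dtype=np.float64).reshape(8, 8)
distorted = 0.8 * window + 10.0
print(ssim_general(window, distorted), ssim_simplified(window, distorted))
```

The two printed values agree because, with C3 = C2/2, the factor (2σxσy + C2) cancels between the contrast and structure terms.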
Figure 9 shows several distorted images quantified using MSE and SSIM. The
images are clearly of different quality as judged by the human visual system
(HVS). However, all the distorted images have approximately the same MSE, whereas SSIM is
lower for the poorer quality images and therefore gives a much better indication of image
quality than MSE.
(a) Original: MSE = 0, SSIM = 1
(b) Mean luminance shift: MSE = 144, SSIM = 0.988
(c) Contrast stretch: MSE = 144, SSIM = 0.913
(d) Impulse noise contamination: MSE = 144, SSIM = 0.840
(e) Blurring: MSE = 144, SSIM = 0.694
(f) JPEG compression: MSE = 142, SSIM = 0.662
Figure 9. MSE and SSIM measurement of images under different distortions. (a) original image,
(b) mean luminance shift, (c) contrast stretch, (d) impulse noise contamination, (e) blurring, and
(f) JPEG [22] compression [13]
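The effect illustrated in Figure 9 can be reproduced in miniature with a hedged Python sketch. It builds two distortions with the same MSE of 144 by construction (a mean shift of 12 gray levels and random noise of amplitude 12) and scores them with a simplified SSIM. The synthetic test pattern, the non-overlapping 8x8 blocks (used here in place of the sliding window of [5]), the constants, and the function name are all assumptions made for illustration.

```python
import numpy as np


def block_ssim(ref, dist, block=8, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Mean of the equation (5) SSIM over non-overlapping blocks (a simplification)."""
    ref = np.asarray(ref, dtype=np.float64)
    dist = np.asarray(dist, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    scores = []
    for r in range(0, ref.shape[0] - block + 1, block):
        for c in range(0, ref.shape[1] - block + 1, block):
            x = ref[r:r + block, c:c + block]
            y = dist[r:r + block, c:c + block]
            mu_x, mu_y = x.mean(), y.mean()
            cov_xy = np.mean((x - mu_x) * (y - mu_y))
            num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
            den = (mu_x ** 2 + mu_y ** 2 + c1) * (x.var() + y.var() + c2)
            scores.append(num / den)
    return float(np.mean(scores))


# Hypothetical smooth test pattern with values well inside the 8-bit range.
rows, cols = np.mgrid[0:64, 0:64]
reference = 125.0 + 60.0 * np.sin(rows / 5.0) * np.cos(cols / 7.0)

# Two distortions with identical MSE: every pixel changes by exactly 12 levels.
shifted = reference + 12.0
rng = np.random.default_rng(0)
noisy = reference + 12.0 * rng.choice([-1.0, 1.0], size=reference.shape)

for name, image in (("mean shift", shifted), ("random noise", noisy)):
    mse_value = float(np.mean((reference - image) ** 2))
    print(f"{name}: MSE = {mse_value:.1f}, SSIM = {block_ssim(reference, image):.3f}")
```

The mean shift leaves the local structure untouched and only lowers the luminance term slightly, while the noise changes the local variance and covariance in every block, so the two images separate under SSIM even though MSE cannot tell them apart.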
FSIM is based on the fact that the HVS understands an image mainly according to its low-level
features [3]. PC is a dimensionless measure of the significance of a local structure [3]. PC and
image GM are used as the primary and secondary features respectively in FSIM [3].
The FSIM score is calculated by applying PC as a weighting function to the local image quality
characterized by PC and GM [3]. FSIM is designed for gray-scale images [3], and FSIMc
incorporates the chrominance information. FSIM can be mathematically modeled as shown in
equation 6 [3].
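\[ \mathrm{FSIM} = \frac{\sum_{x \in \Omega} S_L(x) \cdot PC_m(x)}{\sum_{x \in \Omega} PC_m(x)} \tag{6} \]
Here, S_L(x) = similarity between the reference image and the distorted image at location x, Ω = the whole image spatial domain, and PC_m(x) = the larger of the two images' PC values at x, used as the weight [3]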
FSIMc can be mathematically modeled as shown in equation 7 and the computation process is
illustrated in Figure 10 [3].
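\[ \mathrm{FSIM_C} = \frac{\sum_{x \in \Omega} S_L(x) \cdot [S_C(x)]^{\lambda} \cdot PC_m(x)}{\sum_{x \in \Omega} PC_m(x)} \tag{7} \]
Here, S_C(x) is the chrominance similarity at location x, PC_m(x) is the PC weight, and λ > 0 is the parameter used to adjust the importance of the chrominance components.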
Figure 10. Illustration for FSIM/FSIMc index computation. f1 is the reference image, and f2 is a
distorted version of f1 [3].
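The pooling step of equations 6 and 7 can be sketched in Python as a PC-weighted average. This is a sketch of the final pooling only: computing the PC, GM and chrominance similarity maps themselves is not shown, so the inputs `s_l`, `s_c` and `pc_m` are hypothetical precomputed arrays. The function names and the default `lam=0.03` are assumptions, and `s_c` is assumed to be non-negative.

```python
import numpy as np


def fsim_pool(s_l, pc_m):
    """Equation 6: average of the local similarity S_L weighted by PC_m."""
    s_l = np.asarray(s_l, dtype=np.float64)
    pc_m = np.asarray(pc_m, dtype=np.float64)
    return float(np.sum(s_l * pc_m) / np.sum(pc_m))


def fsimc_pool(s_l, s_c, pc_m, lam=0.03):
    """Equation 7: as equation 6, with S_L scaled by S_C raised to lambda."""
    s_l = np.asarray(s_l, dtype=np.float64)
    s_c = np.asarray(s_c, dtype=np.float64)  # assumed non-negative
    pc_m = np.asarray(pc_m, dtype=np.float64)
    return float(np.sum(s_l * s_c ** lam * pc_m) / np.sum(pc_m))


# Hypothetical two-pixel example: the well-preserved pixel has the larger PC
# weight, so it dominates the pooled score.
similarity = np.array([1.0, 0.0])
weights = np.array([3.0, 1.0])
print(fsim_pool(similarity, weights))
```

Weighting by PC_m means that locations with perceptually significant structure contribute more to the score than flat regions, which is the design choice that separates FSIM from a plain mean of local similarities.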
Each metric uses a different approach to compare images quantitatively, and this
is what distinguishes one method from another. Table 1 shows the ranking of image quality
assessment metric performance on six databases. It can be seen from Table 1 that FSIM ranks
above SSIM, and SSIM above PSNR, on every database.
Table 1. Ranking of image quality assessment metrics performance (FSIM, SSIM and PSNR) on
six databases [3].
        TID2008  CSIQ  LIVE  IVC  MICT  A57
FSIM    1        1     1     1    1     1
SSIM    2        2     2     2    2     2
PSNR    3        3     3     3    3     3
Conclusions
The project aims to study the quality performance of different video codecs, with a
primary focus on Dirac, H.264 and H.265 [19-21]. SSIM, FSIM and
bitrate will be measured for all three video codecs to make a comparative study. Using
test sequences of different spatial/temporal resolutions, MATLAB, Microsoft Visual
Studio and the MSU video quality measurement tool [26] will be used extensively to perform
image quality assessment of the codecs at various bit rates.
References
[1] Dirac Video (2008, September 23), “Dirac Specification” [Online]. Available:
http://diracvideo.org/download/specification/dirac-spec-latest.pdf
[2] I. Richardson (2011), “A Technical Introduction to H.264/AVC” [Online]. Available:
http://www.vcodex.com/files/H.264_technical_introduction.pdf
[3] L. Zhang, L. Zhang, X. Mou, and D. Zhang, “FSIM: A feature similarity index for image
quality assessment,” IEEE Transactions on Image Processing, vol. 20, no. 8,
pp. 2378-2386, Aug. 2011.
[4] Z. Li and A. M. Tourapis, “New video quality metrics in the H.264 reference software,”
Input Document to JVT, Hannover, DE, 20-25 Jul. 2008.
[5] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from
error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13,
issue 4, pp. 600-612, Apr. 2004.
[6] Z. Wang, E.P. Simoncelli, and A.C. Bovik, “Multiscale structural similarity for image
quality assessment,” Conference Record of the Thirty-Seventh Asilomar Conference on
Signals, Systems and Computers, 2003, vol.2, pp. 1398- 1402, 9-12 Nov. 2003.
[7] C. Li and A. C. Bovik, “Content-weighted video quality assessment using a
three-component image model,” Journal of Electronic Imaging, vol. 19, pp. 65-71, Mar. 2010.
[8] X. Ran and N. Farvardin, “A perceptually-motivated three-component image model - part
I: description of the model,” IEEE Transactions on Image Processing, vol.4, no.4,
pp.401-415, Apr. 1995.
[9] J. L. Li, G. Chen, and Z. R. Chi, “Image coding quality assessment using fuzzy integrals
with a three-component image model,” IEEE Transactions on Fuzzy Systems, vol.12,
no.1, pp. 99- 106, Feb. 2004.
[10] G. J. Sullivan and J. Ohm, “Recent developments in standardization of high efficiency
video coding (HEVC),” Proc. SPIE 7798, 77980V, 2010.
[11] G. Sullivan, P. Topiwalla, and A. Luthra, “The H.264/AVC video coding standard:
overview and introduction to the fidelity range extensions,” SPIE Conference on
Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, Aug. 2004.
[12] A. Puri, X. Chen, and A. Luthra, “Video coding using the H.264/MPEG-4 AVC
compression standard,” Signal Processing: Image Communication, vol. 19, pp. 793-849,
Oct. 2004.
[13] Z. Wang et al (2003, February), “The SSIM index for image quality assessment”
[Online]. Available: https://ece.uwaterloo.ca/~z70wang/research/ssim/
[14] C. Chukka, “A universal image quality index and SSIM comparison” [Online]. Available:
http://www-ee.uta.edu/Dip/Courses/EE5359/chaitanyaee5359d.pdf
[15] BBC Research, “The technology behind Dirac” [Online]. Available:
http://www.bbc.co.uk/rd/projects/dirac/technology.shtml
[16] E. Alshina et al, “Technical considerations of new challenges in video coding
standardization,” International Organization for Standardization Organization
Internationale De Normalisation ISO/IEC JTC1/SC29/WG11 Coding of Moving Pictures
and Audio, Oct. 2008.
[17] S. Jeong et al, “Highly efficient video codec for entertainment quality,” ETRI Journal,
vol.33, no. 2, pp. 145-154, Apr. 2011.
[18] K. R. Rao and D. N. Kim, “Current video coding standards: H.264/AVC, Dirac, AVS
China and VC-1,” 42nd Southeastern Symposium on System Theory (SSST), pp.1-8,
Mar. 2010.
[19] A. M. Tourapis (January 2009), “H.264/14496-10 AVC reference software manual”
[Online]. Available:
http://iphome.hhi.de/suehring/tml/JM%20Reference%20Software%20Manual%20%28JVT-AE010%29.pdf
[20] F. Bossen, D. Flynn, and K. Sühring (July 2011), “HEVC reference software manual”
[Online]. Available:
http://phenix.int-evry.fr/jct/doc_end_user/documents/6_Torino/wg11/JCTVC-F634-v2.zip
[21] DiracPRO software: http://dirac.kw.bbc.co.uk/download/
[22] D. T. Lee, “JPEG 2000: Retrospective and new developments,” Proc. IEEE, vol. 93, pp.
32-41, Jan. 2005.
[23] KTA software: http://iphome.hhi.de/suehring/tml/download/KTA/
[24] H.264/AVC Reference Software: http://iphome.hhi.de/suehring/tml/download/
[25] A. Ravi, “Performance analysis and comparison of the Dirac video codec with
H.264/MPEG-4 part 10 AVC,” M.S. thesis, Dept. Elect. Eng., Univ. of Texas at
Arlington, 2009.
[25] I. E. G. Richardson, “H.264 and MPEG-4 video compression: video coding for next
generation multimedia,” Great Britain: Wiley, 2003, pp. 159-223.
[26] MSU video quality measurement tool:
http://compression.ru/video/quality_measure/video_measurement_tool_en.html