UNIVERSITY OF BRITISH COLUMBIA
RESEARCH PROGRESS IN DEC 2014
Bambang AB Sarif

Summary
- Problem: minimizing the energy consumption of a Video Sensor Network.
- Previous work: complexity and bitrate models for different GOP sizes and motion estimation levels (i.e., the block size candidates used).
- Current work: incorporate the effect of QP and of the spatial information (SI) and temporal information (TI) values into the model.
- Results:
  - Complexity modeling: correlation 0.983, RMSE = 786 million instructions.
  - Bitrate modeling: correlation 0.977, RMSE = 465 kbps (better than the modified ICIP 2014 model: 0.927 and 9434).
- Plan:
  - Write a journal paper on the model.
  - Incorporate the model into an optimization process.
  - Write the thesis.

Video Sensor Networks
- Minimizing energy consumption is very important:
  - Encoding power consumption
  - Communication (transmission and reception) power consumption
- Find the encoding configuration that optimizes the energy consumption.
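
A minimal sketch of the selection step described above, assuming total power is the sum of an encoding term (proportional to complexity) and a communication term (proportional to bitrate). The `Candidate` class, the candidate values, and the energy-per-instruction and energy-per-bit constants are all hypothetical and only illustrate the trade-off; they are not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One encoding configuration with its predicted cost (hypothetical values)."""
    name: str
    instructions_per_s: float  # predicted encoding complexity
    bits_per_s: float          # predicted bitrate


def total_power(c, joule_per_instruction, joule_per_bit):
    """Encoding power plus communication power, in watts."""
    return (c.instructions_per_s * joule_per_instruction
            + c.bits_per_s * joule_per_bit)


def best_configuration(candidates, joule_per_instruction, joule_per_bit):
    """Return the candidate with the lowest total power."""
    return min(
        candidates,
        key=lambda c: total_power(c, joule_per_instruction, joule_per_bit),
    )


# Two hypothetical configurations: cheap to encode but high bitrate,
# versus expensive to encode but low bitrate.
candidates = [
    Candidate("intra-only", instructions_per_s=2.0e8, bits_per_s=2.0e6),
    Candidate("long-GOP", instructions_per_s=8.0e8, bits_per_s=4.0e5),
]

# With a cheap radio the encoding term dominates; with a costly radio
# the communication term dominates.
cheap_radio = best_configuration(candidates, 1e-9, 1e-8)
costly_radio = best_configuration(candidates, 1e-9, 1e-6)
```

The same pair of candidates yields a different winner depending on the assumed radio cost, which is why both the complexity and the bitrate models are needed.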

Our video datasets
- Different event settings: office, classroom, party
- Different camera FoV
- Motion level varies per camera and also during each shot (10 s of video)

- For each event (office, classroom, and party) we have 4 scenes from 9 cameras. In total we have 108 videos.
- Each video has different spatial information (SI) and temporal information (TI) values (ITU-T Recommendation).
- The non-standard version uses the mean value instead of the max (ICIP 2011).
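
A minimal sketch of the two SI/TI variants, assuming the ITU-T P.910 definitions: SI is the spatial standard deviation of the Sobel-filtered luminance frame, TI is the spatial standard deviation of the difference between consecutive frames, and the per-frame values are pooled over time. The function names and the `pool` parameter are illustrative; the mean-pooled variant is only my reading of "mean value instead of max".

```python
import numpy as np


def sobel_magnitude(frame):
    """Sobel gradient magnitude on the interior pixels of a 2-D luminance frame."""
    f = np.asarray(frame, dtype=np.float64)
    # Horizontal gradient: right column minus left column of each 3x3 window.
    gx = ((f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:])
          - (f[:-2, :-2] + 2 * f[1:-1, :-2] + f[2:, :-2]))
    # Vertical gradient: bottom row minus top row of each 3x3 window.
    gy = ((f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:])
          - (f[:-2, :-2] + 2 * f[:-2, 1:-1] + f[:-2, 2:]))
    return np.hypot(gx, gy)


def si_ti(frames, pool=np.max):
    """Return (SI, TI) for frames of shape (n_frames, height, width).

    pool=np.max gives the standard time pooling; pool=np.mean gives the
    mean-pooled variant. At least two frames are required for TI.
    """
    frames = np.asarray(frames, dtype=np.float64)
    si_per_frame = [sobel_magnitude(f).std() for f in frames]
    ti_per_frame = [(frames[i] - frames[i - 1]).std()
                    for i in range(1, len(frames))]
    return float(pool(si_per_frame)), float(pool(ti_per_frame))
```

Calling `si_ti(frames)` and `si_ti(frames, pool=np.mean)` on the same clip gives the two versions; the max-pooled values can never be smaller than the mean-pooled ones.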

Complexity and Bitrate model
- Power-Rate-Distortion model (Zhihai He et al., IEEE Trans. CSVT 2005)
  - Model terms: encoding power efficiency (given as a parameter in simulation) and video variance.
  - Used in a simulation of 9 video nodes where each node is assumed to have the same -2 (Yifeng He et al., IEEE Trans. CSVT 2009).
- Ma's model (IEEE Trans. CSVT 2012)
  - Perceptual quality and bitrate model for different QP and frame rate.
  - Features used: frame difference, normalized frame difference, MV, displaced frame difference, motion activity intensity, MV normalized by contrast, MV normalized by intensity, MV normalized by variance.
  - Rmax, a, and b are obtained using least-squares regression of the features.

Lottermann's model (ICIP 2014)
- Follows Ma's model but uses the non-standard spatial information (SI) and temporal information (TI) units.
- 6 videos for training and 4 videos for testing.
- 120 selected frames of the videos, where the SI and TI values are stable.
- QP from 24 to 45 with a step size of 1.
- Frame rates: 15 fps, 10 fps, 5 fps, and 3 fps.
- Rmax, a, and b are estimated using least-squares regression with cross-validation error, from features in the form p1 x1 + p2 x2 + ... + pn xn, with xi in {TI, SI, log(TI), log(SI), SITI, log(SITI)}:
  Rmax = 0.8149 TISI + 1394
  a = 20123 log(SI) - 0.0004 TI SI - 0.4616
  b = 0.1334 log(SITI) - 0.3072
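
A minimal sketch of this kind of feature regression, under stated assumptions. The rate function uses the form R(q, t) = Rmax (q/qmin)^(-a) (t/tmax)^b, which is how I recall Ma's model and is not shown on the slide. The feature set below treats the combined SI/TI term as the product SI*TI, which is also an assumption, and it leaves out log(SI*TI) because that is the sum of log(SI) and log(TI). No cross-validation or feature selection is performed, and all function names are illustrative.

```python
import numpy as np


def features(si, ti):
    """Design matrix with columns 1, TI, SI, log(TI), log(SI), SI*TI."""
    si = np.asarray(si, dtype=np.float64)
    ti = np.asarray(ti, dtype=np.float64)
    return np.column_stack(
        [np.ones_like(si), ti, si, np.log(ti), np.log(si), si * ti]
    )


def fit_linear(si, ti, target):
    """Least-squares weights p such that features(si, ti) @ p approximates target."""
    coeffs, *_ = np.linalg.lstsq(
        features(si, ti), np.asarray(target, dtype=np.float64), rcond=None
    )
    return coeffs


def predict(coeffs, si, ti):
    """Predict a model parameter (for example Rmax, a, or b) from SI and TI."""
    return features(si, ti) @ coeffs


def bitrate(q, t, r_max, a, b, q_min, t_max):
    """Assumed rate model: Rmax * (q / q_min) ** (-a) * (t / t_max) ** b."""
    return r_max * (q / q_min) ** (-a) * (t / t_max) ** b
```

In use, one weight vector would be fitted per model parameter from the training videos, and the predicted Rmax, a, and b would then be passed to `bitrate` for each quantization step and frame rate.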

Our Model
- QP ranges from 28 to 40 with a step size of 2.
- Frame rate is 15 fps, but the GOP size varies: 1, 2, 4, 8, 16, 32, 64.
  - Note: the increase in complexity (and decrease in bitrate) between GOP sizes 32 and 64 is very small.
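
The QP and GOP grid above can be enumerated as follows. This is only a sketch: the dictionary keys are illustrative, and the motion estimation level is left out because its levels are not listed in the text.

```python
from itertools import product

QP_VALUES = list(range(28, 41, 2))        # 28, 30, ..., 40
GOP_SIZES = [2 ** k for k in range(7)]    # 1, 2, 4, ..., 64
FRAME_RATE_FPS = 15

# One entry per (QP, GOP size) pair at the fixed frame rate.
configurations = [
    {"qp": qp, "gop": gop, "fps": FRAME_RATE_FPS}
    for qp, gop in product(QP_VALUES, GOP_SIZES)
]
```

This gives 7 QP values times 7 GOP sizes, so 49 encodings per video for each motion estimation level.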
Motion estimation level (ML) is defined as follows:

- Complexity model
- Bitrate model
  f(GOP) = -2log(GOP)
- For f(-ML) we check three different functions.
  - One of them is the function used in our IARIA paper. However, in that paper the value of -3 is not derived from SI/TI.
- CI, CP, -1, and - are estimated from the training set using the same features used by the Lottermann model.
- RI, RP, -, and the parameters for f(-ML) and f(GOP) are estimated from the training set using the same features used by the Lottermann model.

For comparison, we modify the Lottermann model to include -ML.
- Complexity model
- Bitrate model
- CI, a, b, and c are estimated from the training set using the same features used by the Lottermann model.
- RI, d, e, and f are estimated from the training set using the same features used by the Lottermann model.

- Training: 27 videos (office_1, classroom_1, party_1); test: 81 videos.
- Results are compared to the modified Lottermann model (ICIP 2014).
- A few things noticed:
  - The bitrate estimation error is significantly lower if we use the non-standard SI/TI.
  - If we use the standard SI/TI, the above result is the best. If we use a different training set (i.e., office_4, classroom_3, and party_2), the result is worse or even bad, especially in percentage error.
  - If the non-standard SI/TI is used (ICIP 2011), the result does not change much regardless of which training set I use.
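
The split sizes above follow from the dataset layout (3 events, 4 scenes, 9 cameras). A short sketch, assuming one scene per event is held out for training; the tuple representation and the `split` helper are illustrative and not the actual file naming.

```python
from itertools import product

EVENTS = ("office", "classroom", "party")
SCENES = range(1, 5)     # 4 scenes per event
CAMERAS = range(1, 10)   # 9 cameras per scene

videos = list(product(EVENTS, SCENES, CAMERAS))


def split(videos, training_scene):
    """Split videos using a mapping from event name to its training scene number."""
    train = [v for v in videos if training_scene[v[0]] == v[1]]
    test = [v for v in videos if training_scene[v[0]] != v[1]]
    return train, test


# Default training set: office_1, classroom_1, party_1.
train, test = split(videos, {"office": 1, "classroom": 1, "party": 1})

# Alternative training set: office_4, classroom_3, party_2.
alt_train, alt_test = split(videos, {"office": 4, "classroom": 3, "party": 2})
```

Both choices give 27 training and 81 test videos, so any difference in the results comes from which scenes are used rather than from the amount of training data.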
- Note: the papers in IEEE Trans. CSVT and ICIP that I use as references do not compare percentage error. They only provide the Pearson correlation (PC) coefficient and RMSE.
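
A minimal sketch of the three metrics mentioned in these notes. The percentage error is taken here to be the mean absolute percentage error, which is an assumption because the exact definition is not given; the sample values are made up to show how the metrics can disagree.

```python
import math


def pearson(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mean_abs_pct_error(actual, predicted):
    """Mean absolute error relative to the actual value, in percent."""
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Made-up bitrates in kbps: every prediction is off by a constant 50 kbps.
actual = [100.0, 1000.0, 2000.0]
predicted = [150.0, 1050.0, 2050.0]

pc = pearson(actual, predicted)
err = rmse(actual, predicted)
pct = mean_abs_pct_error(actual, predicted)
```

In this example the correlation is perfect and the RMSE is 50 kbps, yet the percentage error is large because the low-bitrate sample is off by 50%. A model can therefore look good on PC and RMSE while its percentage error is poor.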

Complexity for different ML and GOP size (QP = 28), office_2 cam1 video
Bitrate for different QP and GOP size, office_2 cam1 video

References
- ITU-T, "P.910: Subjective video quality assessment methods for multimedia applications," Tech. Rep. P.910, ITU-T (1992).
- Zhihai He, Yongfang Liang, Lulin Chen, Ishfaq Ahmad, and Dapeng Wu, "Power-Rate-Distortion Analysis for Wireless Video Communication Under Energy Constraints," IEEE Trans. CSVT, Vol. 15, No. 5, May 2005.
- Zhihai He and Dapeng Wu, "Resource Allocation and Performance Analysis of Wireless Video Sensors," IEEE Trans. CSVT, Vol. 16, No. 5, May 2006.
- Yifeng He, Ivan Lee, and Ling Guan, "Distributed Algorithms for Network Lifetime Maximization in Wireless Visual Sensor Networks," IEEE Trans. CSVT, Vol. 19, No. 5, May 2009.
- Yang Peng and Eckehard Steinbach, "A Novel Full-reference Video Quality Metric and its Application to Wireless Video Transmission," ICIP 2011.
- Yen-Fu Ou, Zhan Ma, Tao Liu, and Yao Wang, "Perceptual Quality Assessment of Video Considering Both Frame Rate and Quantization Artifacts," IEEE Trans. CSVT, Vol. 21, No. 3, March 2011.
- Zhan Ma, Meng Xu, Yen-Fu Ou, and Yao Wang, "Modeling of Rate and Perceptual Quality of Compressed Video as Functions of Frame Rate and Quantization Stepsize and Its Applications," IEEE Trans. CSVT, Vol. 22, No. 5, May 2012.
- Christian Lottermann, Alexander Machado, Damien Schroeder, Yang Peng, and Eckehard Steinbach, "Bit Rate Estimation for H.264/AVC Video Encoding Based on Temporal and Spatial Activities," ICIP 2014.
2
Summary
Problem Minimizing energy consumption of Video Sensor Network Previous work
complexity and bitrate model for different GOP size and motion estimation level ie block size candidates used
Current work incorporate the effect of QP and spatial information (SI) and temporal
information (TI) values into the model result
Complexity modeling correlation 0983 RMSE=786 mil instruction Bitrate modeling correlation 0977 RMSE=465 kbps (better than modified
ICIP 2014 paper 0927 and 9434)
Plan Write a journal paper on the model Incorporate the model into an optimization process Write the thesis
3
Video Sensor Networks
Minimizing energy consumption is very important- Encoding power consumption- Communication (transmission and
reception) power consumption Find the encoding configuration that
optimize the energy consumption
4
Our video datasets Different event settings office classroom party Different camera FoV Motion level varies per each camera and also during each shot (10s of video)
5
For each event (office classroom and party) we have 4 scenes from 9 cameras In total we have 108 videos Each video has different spatial information (SI) and temporal information (TI) (ITU-T Recommendation)
non-standard version uses mean value instead of max (ICIP 2011)
6
Complexity and Bitrate model
Power-Rate-Distortion model (Zhihai He et al IEEE Trans CSVT 2005)
Used in simulation of 9 video nodes where each node is assumed to have the same -2 (Yifeng He et al IEEE Trans CSVT 2009)
Marsquos Model (IEEE Trans CSVT 2012) Perceptual quality and bitrate model for different QP and frame rate Features used frame difference normalized frame difference MV displaced frame
difference motion activity intensity MV normalized by contrast MV normalized by intensity MV normalized by variance
encoding power efficiency given as a parameter in simulation
video variance
Rmax a and b are obtained using least square regression of features
7
Lottermannrsquos model (ICIP 2014) Follows Marsquos model but use non-standard spatial information unit (SI) and temporal
information unit (TI) 6 videos for training and 4 videos for test 120 select frames of videos where SI and TI values are stable QP from 24 until 45 step size 1 Frame rate 15 fps 10 fps 5 fps and 3 fps
Rmax a and b are estimated using least square regression with cross validation error from the features in the form of p1x1+ p2x2 +hellip + pnxn with xi - TI SI log(TI) log(SI) SITI log(SITI)
Rmax = 08149 TISI + 1394 a = 20123 log(SI) ndash 00004 TI SI ndash 04616 b = 01334 log(SITI) ndash 03072
8
Our Model QP is from 28 until 40 with step size of 2 Frame rate is 15 fps but GOP size varies=1248163264
Note the increase of complexity (and decrease of bitrate) between GOP size 32 and 64 is very small
Motion estimation level is defined as follow
9
Complexity model
Bitrate model
f(GOP) = -2log(GOP)
For f(-ML) we check three different functions
CI CP -1 and - are estimated from the training set using the same features used by the Lottermann model
RI RP - and parameters for f(-ML) and f(GOP) are estimated from the training set using the same features used by the Lottermann model
The one used in our IARIA paper However in that paper the value of -3 is not derived from SITI
10
For comparison we modify the Lottermann model to include -ML Complexity model
Bitrate model
CI a b and c are estimated from the training set using the same features used by the Lottermann model
RI d e and f are estimated from the training set using the same features used by the Lottermann model
11
Training 27 videos (office_1 classroom_1 party_1) test 81 videos Results compared to modified Lottermann model (ICIP 2014)
Noticed few things The bitrate estimation error is significantly lower if we use non-standard SITI If we use standard SITI the above result is the best If we use different training set (ie office_4
classroom_3 and party_2) the result is worse or even bad especially in of error If the non-standard SITI is used (ICIP 2011) the result doesnrsquot change too much regardless of which
training set I use
Note The papers in IEEE Trans CVST and ICIP that I use as reference do not compare of error They only
provide the PC (Pearson Correlation) coefficient and RMSE
12
Complexity for different ML and GOP size (QP=28) office_2 cam1 video
Bitrate for different QP and GOP size office_2 cam1 video
13
References ITU-R ldquoP910 Subjective video quality assessment methods for multimedia applicationsrdquo Tech Rep P910
ITU-R (1992) Zhihai He Yongfang Liang Lulin Chen Ishfaq Ahmad and Dapeng Wu ldquoPower-Rate-Distortion Analysis for
Wireless Video Communication Under Energy Constraintsrdquo IEEE Trans CSVT Vol 15 No 5 May 2005 Zhihai He and Dapeng Wu ldquoResource Allocation and Performance Analysis of Wireless Video Sensorsrdquo
IEEE Trans CSVT Vol 16 No 5 May 2006 Yifeng He Ivan Lee and Ling Guan ldquoDistributed Algorithms for Network Lifetime Maximization in Wireless
Visual Sensor Networksrdquo IEEE Trans CSVT Vol 19 No 5 May 2009 Yang Peng and Eckehard Steinbach A Novel Full-reference Video Quality Metric and its Application to
Wireless Video Transmission ICIP 2011 Yen-Fu Ou Zhan Ma Tao Liu and Yao Wang Perceptual Quality Assessment of Video Considering Both
Frame Rate and Quantization Artifacts IEEE Trans CSVT Vol 21 No 3 March 2011 Zhan Ma Meng Xu Yen-Fu Ou and Yao Wang ldquoModeling of Rate and Perceptual Quality of Compressed
Video as Functions of Frame Rate and Quantization Stepsize and Its Applicationsrdquo IEEE Trans CSVT Vol 22 No 5 May 2012
Christian Lottermann Alexander Machado Damien Schroeder Yang Peng and Eckehard Steinbach ldquoBit Rate Estimation for H264AVC Video Encoding Based on Temporal and Spatial Activitiesrdquo ICIP 2014
3
Video Sensor Networks
Minimizing energy consumption is very important- Encoding power consumption- Communication (transmission and
reception) power consumption Find the encoding configuration that
optimize the energy consumption
4
Our video datasets Different event settings office classroom party Different camera FoV Motion level varies per each camera and also during each shot (10s of video)
5
For each event (office classroom and party) we have 4 scenes from 9 cameras In total we have 108 videos Each video has different spatial information (SI) and temporal information (TI) (ITU-T Recommendation)
non-standard version uses mean value instead of max (ICIP 2011)
6
Complexity and Bitrate model
Power-Rate-Distortion model (Zhihai He et al IEEE Trans CSVT 2005)
Used in simulation of 9 video nodes where each node is assumed to have the same -2 (Yifeng He et al IEEE Trans CSVT 2009)
Marsquos Model (IEEE Trans CSVT 2012) Perceptual quality and bitrate model for different QP and frame rate Features used frame difference normalized frame difference MV displaced frame
difference motion activity intensity MV normalized by contrast MV normalized by intensity MV normalized by variance
encoding power efficiency given as a parameter in simulation
video variance
Rmax a and b are obtained using least square regression of features
7
Lottermannrsquos model (ICIP 2014) Follows Marsquos model but use non-standard spatial information unit (SI) and temporal
information unit (TI) 6 videos for training and 4 videos for test 120 select frames of videos where SI and TI values are stable QP from 24 until 45 step size 1 Frame rate 15 fps 10 fps 5 fps and 3 fps
Rmax a and b are estimated using least square regression with cross validation error from the features in the form of p1x1+ p2x2 +hellip + pnxn with xi - TI SI log(TI) log(SI) SITI log(SITI)
Rmax = 08149 TISI + 1394 a = 20123 log(SI) ndash 00004 TI SI ndash 04616 b = 01334 log(SITI) ndash 03072
8
Our Model QP is from 28 until 40 with step size of 2 Frame rate is 15 fps but GOP size varies=1248163264
Note the increase of complexity (and decrease of bitrate) between GOP size 32 and 64 is very small
Motion estimation level is defined as follow
9
Complexity model
Bitrate model
f(GOP) = -2log(GOP)
For f(-ML) we check three different functions
CI CP -1 and - are estimated from the training set using the same features used by the Lottermann model
RI RP - and parameters for f(-ML) and f(GOP) are estimated from the training set using the same features used by the Lottermann model
The one used in our IARIA paper However in that paper the value of -3 is not derived from SITI
10
For comparison we modify the Lottermann model to include -ML Complexity model
Bitrate model
CI a b and c are estimated from the training set using the same features used by the Lottermann model
RI d e and f are estimated from the training set using the same features used by the Lottermann model
11
Training 27 videos (office_1 classroom_1 party_1) test 81 videos Results compared to modified Lottermann model (ICIP 2014)
Noticed few things The bitrate estimation error is significantly lower if we use non-standard SITI If we use standard SITI the above result is the best If we use different training set (ie office_4
classroom_3 and party_2) the result is worse or even bad especially in of error If the non-standard SITI is used (ICIP 2011) the result doesnrsquot change too much regardless of which
training set I use
Note The papers in IEEE Trans CVST and ICIP that I use as reference do not compare of error They only
provide the PC (Pearson Correlation) coefficient and RMSE
12
Complexity for different ML and GOP size (QP=28) office_2 cam1 video
Bitrate for different QP and GOP size office_2 cam1 video
13
References ITU-R ldquoP910 Subjective video quality assessment methods for multimedia applicationsrdquo Tech Rep P910
ITU-R (1992) Zhihai He Yongfang Liang Lulin Chen Ishfaq Ahmad and Dapeng Wu ldquoPower-Rate-Distortion Analysis for
Wireless Video Communication Under Energy Constraintsrdquo IEEE Trans CSVT Vol 15 No 5 May 2005 Zhihai He and Dapeng Wu ldquoResource Allocation and Performance Analysis of Wireless Video Sensorsrdquo
IEEE Trans CSVT Vol 16 No 5 May 2006 Yifeng He Ivan Lee and Ling Guan ldquoDistributed Algorithms for Network Lifetime Maximization in Wireless
Visual Sensor Networksrdquo IEEE Trans CSVT Vol 19 No 5 May 2009 Yang Peng and Eckehard Steinbach A Novel Full-reference Video Quality Metric and its Application to
Wireless Video Transmission ICIP 2011 Yen-Fu Ou Zhan Ma Tao Liu and Yao Wang Perceptual Quality Assessment of Video Considering Both
Frame Rate and Quantization Artifacts IEEE Trans CSVT Vol 21 No 3 March 2011 Zhan Ma Meng Xu Yen-Fu Ou and Yao Wang ldquoModeling of Rate and Perceptual Quality of Compressed
Video as Functions of Frame Rate and Quantization Stepsize and Its Applicationsrdquo IEEE Trans CSVT Vol 22 No 5 May 2012
Christian Lottermann Alexander Machado Damien Schroeder Yang Peng and Eckehard Steinbach ldquoBit Rate Estimation for H264AVC Video Encoding Based on Temporal and Spatial Activitiesrdquo ICIP 2014
4
Our video datasets Different event settings office classroom party Different camera FoV Motion level varies per each camera and also during each shot (10s of video)
5
For each event (office classroom and party) we have 4 scenes from 9 cameras In total we have 108 videos Each video has different spatial information (SI) and temporal information (TI) (ITU-T Recommendation)
non-standard version uses mean value instead of max (ICIP 2011)
6
Complexity and Bitrate model
Power-Rate-Distortion model (Zhihai He et al IEEE Trans CSVT 2005)
Used in simulation of 9 video nodes where each node is assumed to have the same -2 (Yifeng He et al IEEE Trans CSVT 2009)
Marsquos Model (IEEE Trans CSVT 2012) Perceptual quality and bitrate model for different QP and frame rate Features used frame difference normalized frame difference MV displaced frame
difference motion activity intensity MV normalized by contrast MV normalized by intensity MV normalized by variance
encoding power efficiency given as a parameter in simulation
video variance
Rmax a and b are obtained using least square regression of features
7
Lottermannrsquos model (ICIP 2014) Follows Marsquos model but use non-standard spatial information unit (SI) and temporal
information unit (TI) 6 videos for training and 4 videos for test 120 select frames of videos where SI and TI values are stable QP from 24 until 45 step size 1 Frame rate 15 fps 10 fps 5 fps and 3 fps
Rmax a and b are estimated using least square regression with cross validation error from the features in the form of p1x1+ p2x2 +hellip + pnxn with xi - TI SI log(TI) log(SI) SITI log(SITI)
Rmax = 08149 TISI + 1394 a = 20123 log(SI) ndash 00004 TI SI ndash 04616 b = 01334 log(SITI) ndash 03072
8
Our Model QP is from 28 until 40 with step size of 2 Frame rate is 15 fps but GOP size varies=1248163264
Note the increase of complexity (and decrease of bitrate) between GOP size 32 and 64 is very small
Motion estimation level is defined as follow
9
Complexity model
Bitrate model
f(GOP) = -2log(GOP)
For f(-ML) we check three different functions
CI CP -1 and - are estimated from the training set using the same features used by the Lottermann model
RI RP - and parameters for f(-ML) and f(GOP) are estimated from the training set using the same features used by the Lottermann model
The one used in our IARIA paper However in that paper the value of -3 is not derived from SITI
10
For comparison we modify the Lottermann model to include -ML Complexity model
Bitrate model
CI a b and c are estimated from the training set using the same features used by the Lottermann model
RI d e and f are estimated from the training set using the same features used by the Lottermann model
11
Training 27 videos (office_1 classroom_1 party_1) test 81 videos Results compared to modified Lottermann model (ICIP 2014)
Noticed few things The bitrate estimation error is significantly lower if we use non-standard SITI If we use standard SITI the above result is the best If we use different training set (ie office_4
classroom_3 and party_2) the result is worse or even bad especially in of error If the non-standard SITI is used (ICIP 2011) the result doesnrsquot change too much regardless of which
training set I use
Note The papers in IEEE Trans CVST and ICIP that I use as reference do not compare of error They only
provide the PC (Pearson Correlation) coefficient and RMSE
12
Complexity for different ML and GOP size (QP=28) office_2 cam1 video
Bitrate for different QP and GOP size office_2 cam1 video
13
References ITU-R ldquoP910 Subjective video quality assessment methods for multimedia applicationsrdquo Tech Rep P910
ITU-R (1992) Zhihai He Yongfang Liang Lulin Chen Ishfaq Ahmad and Dapeng Wu ldquoPower-Rate-Distortion Analysis for
Wireless Video Communication Under Energy Constraintsrdquo IEEE Trans CSVT Vol 15 No 5 May 2005 Zhihai He and Dapeng Wu ldquoResource Allocation and Performance Analysis of Wireless Video Sensorsrdquo
IEEE Trans CSVT Vol 16 No 5 May 2006 Yifeng He Ivan Lee and Ling Guan ldquoDistributed Algorithms for Network Lifetime Maximization in Wireless
Visual Sensor Networksrdquo IEEE Trans CSVT Vol 19 No 5 May 2009 Yang Peng and Eckehard Steinbach A Novel Full-reference Video Quality Metric and its Application to
Wireless Video Transmission ICIP 2011 Yen-Fu Ou Zhan Ma Tao Liu and Yao Wang Perceptual Quality Assessment of Video Considering Both
Frame Rate and Quantization Artifacts IEEE Trans CSVT Vol 21 No 3 March 2011 Zhan Ma Meng Xu Yen-Fu Ou and Yao Wang ldquoModeling of Rate and Perceptual Quality of Compressed
Video as Functions of Frame Rate and Quantization Stepsize and Its Applicationsrdquo IEEE Trans CSVT Vol 22 No 5 May 2012
Christian Lottermann Alexander Machado Damien Schroeder Yang Peng and Eckehard Steinbach ldquoBit Rate Estimation for H264AVC Video Encoding Based on Temporal and Spatial Activitiesrdquo ICIP 2014
5
For each event (office classroom and party) we have 4 scenes from 9 cameras In total we have 108 videos Each video has different spatial information (SI) and temporal information (TI) (ITU-T Recommendation)
non-standard version uses mean value instead of max (ICIP 2011)
6
Complexity and Bitrate model
Power-Rate-Distortion model (Zhihai He et al IEEE Trans CSVT 2005)
Used in simulation of 9 video nodes where each node is assumed to have the same -2 (Yifeng He et al IEEE Trans CSVT 2009)
Marsquos Model (IEEE Trans CSVT 2012) Perceptual quality and bitrate model for different QP and frame rate Features used frame difference normalized frame difference MV displaced frame
difference motion activity intensity MV normalized by contrast MV normalized by intensity MV normalized by variance
encoding power efficiency given as a parameter in simulation
video variance
Rmax a and b are obtained using least square regression of features
7
Lottermannrsquos model (ICIP 2014) Follows Marsquos model but use non-standard spatial information unit (SI) and temporal
information unit (TI) 6 videos for training and 4 videos for test 120 select frames of videos where SI and TI values are stable QP from 24 until 45 step size 1 Frame rate 15 fps 10 fps 5 fps and 3 fps
Rmax a and b are estimated using least square regression with cross validation error from the features in the form of p1x1+ p2x2 +hellip + pnxn with xi - TI SI log(TI) log(SI) SITI log(SITI)
Rmax = 08149 TISI + 1394 a = 20123 log(SI) ndash 00004 TI SI ndash 04616 b = 01334 log(SITI) ndash 03072
8
Our Model QP is from 28 until 40 with step size of 2 Frame rate is 15 fps but GOP size varies=1248163264
Note the increase of complexity (and decrease of bitrate) between GOP size 32 and 64 is very small
Motion estimation level is defined as follow
9
Complexity model
Bitrate model
f(GOP) = -2log(GOP)
For f(-ML) we check three different functions
CI CP -1 and - are estimated from the training set using the same features used by the Lottermann model
RI RP - and parameters for f(-ML) and f(GOP) are estimated from the training set using the same features used by the Lottermann model
The one used in our IARIA paper However in that paper the value of -3 is not derived from SITI
10
For comparison we modify the Lottermann model to include -ML Complexity model
Bitrate model
CI a b and c are estimated from the training set using the same features used by the Lottermann model
RI d e and f are estimated from the training set using the same features used by the Lottermann model
11
Training 27 videos (office_1 classroom_1 party_1) test 81 videos Results compared to modified Lottermann model (ICIP 2014)
Noticed few things The bitrate estimation error is significantly lower if we use non-standard SITI If we use standard SITI the above result is the best If we use different training set (ie office_4
classroom_3 and party_2) the result is worse or even bad especially in of error If the non-standard SITI is used (ICIP 2011) the result doesnrsquot change too much regardless of which
training set I use
Note The papers in IEEE Trans CVST and ICIP that I use as reference do not compare of error They only
provide the PC (Pearson Correlation) coefficient and RMSE
12
Complexity for different ML and GOP size (QP=28) office_2 cam1 video
Bitrate for different QP and GOP size office_2 cam1 video
13
References ITU-R ldquoP910 Subjective video quality assessment methods for multimedia applicationsrdquo Tech Rep P910
ITU-R (1992) Zhihai He Yongfang Liang Lulin Chen Ishfaq Ahmad and Dapeng Wu ldquoPower-Rate-Distortion Analysis for
Wireless Video Communication Under Energy Constraintsrdquo IEEE Trans CSVT Vol 15 No 5 May 2005 Zhihai He and Dapeng Wu ldquoResource Allocation and Performance Analysis of Wireless Video Sensorsrdquo
IEEE Trans CSVT Vol 16 No 5 May 2006 Yifeng He Ivan Lee and Ling Guan ldquoDistributed Algorithms for Network Lifetime Maximization in Wireless
Visual Sensor Networksrdquo IEEE Trans CSVT Vol 19 No 5 May 2009 Yang Peng and Eckehard Steinbach A Novel Full-reference Video Quality Metric and its Application to
Wireless Video Transmission ICIP 2011 Yen-Fu Ou Zhan Ma Tao Liu and Yao Wang Perceptual Quality Assessment of Video Considering Both
Frame Rate and Quantization Artifacts IEEE Trans CSVT Vol 21 No 3 March 2011 Zhan Ma Meng Xu Yen-Fu Ou and Yao Wang ldquoModeling of Rate and Perceptual Quality of Compressed
Video as Functions of Frame Rate and Quantization Stepsize and Its Applicationsrdquo IEEE Trans CSVT Vol 22 No 5 May 2012
Christian Lottermann Alexander Machado Damien Schroeder Yang Peng and Eckehard Steinbach ldquoBit Rate Estimation for H264AVC Video Encoding Based on Temporal and Spatial Activitiesrdquo ICIP 2014
6
Complexity and Bitrate model
Power-Rate-Distortion model (Zhihai He et al IEEE Trans CSVT 2005)
Used in simulation of 9 video nodes where each node is assumed to have the same -2 (Yifeng He et al IEEE Trans CSVT 2009)
Marsquos Model (IEEE Trans CSVT 2012) Perceptual quality and bitrate model for different QP and frame rate Features used frame difference normalized frame difference MV displaced frame
difference motion activity intensity MV normalized by contrast MV normalized by intensity MV normalized by variance
encoding power efficiency given as a parameter in simulation
video variance
Rmax a and b are obtained using least square regression of features
7
Lottermannrsquos model (ICIP 2014) Follows Marsquos model but use non-standard spatial information unit (SI) and temporal
information unit (TI) 6 videos for training and 4 videos for test 120 select frames of videos where SI and TI values are stable QP from 24 until 45 step size 1 Frame rate 15 fps 10 fps 5 fps and 3 fps
Rmax a and b are estimated using least square regression with cross validation error from the features in the form of p1x1+ p2x2 +hellip + pnxn with xi - TI SI log(TI) log(SI) SITI log(SITI)
Rmax = 08149 TISI + 1394 a = 20123 log(SI) ndash 00004 TI SI ndash 04616 b = 01334 log(SITI) ndash 03072
8
Our Model QP is from 28 until 40 with step size of 2 Frame rate is 15 fps but GOP size varies=1248163264
Note the increase of complexity (and decrease of bitrate) between GOP size 32 and 64 is very small
Motion estimation level is defined as follow
9
Complexity model
Bitrate model
f(GOP) = -2log(GOP)
For f(-ML) we check three different functions
CI CP -1 and - are estimated from the training set using the same features used by the Lottermann model
RI RP - and parameters for f(-ML) and f(GOP) are estimated from the training set using the same features used by the Lottermann model
The one used in our IARIA paper However in that paper the value of -3 is not derived from SITI
10
For comparison we modify the Lottermann model to include -ML Complexity model
Bitrate model
CI a b and c are estimated from the training set using the same features used by the Lottermann model
RI d e and f are estimated from the training set using the same features used by the Lottermann model
11
Training 27 videos (office_1 classroom_1 party_1) test 81 videos Results compared to modified Lottermann model (ICIP 2014)
Noticed few things The bitrate estimation error is significantly lower if we use non-standard SITI If we use standard SITI the above result is the best If we use different training set (ie office_4
classroom_3 and party_2) the result is worse or even bad especially in of error If the non-standard SITI is used (ICIP 2011) the result doesnrsquot change too much regardless of which
training set I use
Note The papers in IEEE Trans CVST and ICIP that I use as reference do not compare of error They only
provide the PC (Pearson Correlation) coefficient and RMSE
12
Complexity for different ML and GOP size (QP=28) office_2 cam1 video
Bitrate for different QP and GOP size office_2 cam1 video
13
References ITU-R ldquoP910 Subjective video quality assessment methods for multimedia applicationsrdquo Tech Rep P910
ITU-R (1992) Zhihai He Yongfang Liang Lulin Chen Ishfaq Ahmad and Dapeng Wu ldquoPower-Rate-Distortion Analysis for
Wireless Video Communication Under Energy Constraintsrdquo IEEE Trans CSVT Vol 15 No 5 May 2005 Zhihai He and Dapeng Wu ldquoResource Allocation and Performance Analysis of Wireless Video Sensorsrdquo
IEEE Trans CSVT Vol 16 No 5 May 2006 Yifeng He Ivan Lee and Ling Guan ldquoDistributed Algorithms for Network Lifetime Maximization in Wireless
Visual Sensor Networksrdquo IEEE Trans CSVT Vol 19 No 5 May 2009 Yang Peng and Eckehard Steinbach A Novel Full-reference Video Quality Metric and its Application to
Wireless Video Transmission ICIP 2011 Yen-Fu Ou Zhan Ma Tao Liu and Yao Wang Perceptual Quality Assessment of Video Considering Both
Frame Rate and Quantization Artifacts IEEE Trans CSVT Vol 21 No 3 March 2011 Zhan Ma Meng Xu Yen-Fu Ou and Yao Wang ldquoModeling of Rate and Perceptual Quality of Compressed
Video as Functions of Frame Rate and Quantization Stepsize and Its Applicationsrdquo IEEE Trans CSVT Vol 22 No 5 May 2012
Christian Lottermann Alexander Machado Damien Schroeder Yang Peng and Eckehard Steinbach ldquoBit Rate Estimation for H264AVC Video Encoding Based on Temporal and Spatial Activitiesrdquo ICIP 2014
7
Lottermannrsquos model (ICIP 2014) Follows Marsquos model but use non-standard spatial information unit (SI) and temporal
information unit (TI) 6 videos for training and 4 videos for test 120 select frames of videos where SI and TI values are stable QP from 24 until 45 step size 1 Frame rate 15 fps 10 fps 5 fps and 3 fps
Rmax a and b are estimated using least square regression with cross validation error from the features in the form of p1x1+ p2x2 +hellip + pnxn with xi - TI SI log(TI) log(SI) SITI log(SITI)
Rmax = 08149 TISI + 1394 a = 20123 log(SI) ndash 00004 TI SI ndash 04616 b = 01334 log(SITI) ndash 03072
8
Our Model QP is from 28 until 40 with step size of 2 Frame rate is 15 fps but GOP size varies=1248163264
Note the increase of complexity (and decrease of bitrate) between GOP size 32 and 64 is very small
Motion estimation level is defined as follow
9
Complexity model
Bitrate model
f(GOP) = -2log(GOP)
For f(-ML) we check three different functions
CI CP -1 and - are estimated from the training set using the same features used by the Lottermann model
RI RP - and parameters for f(-ML) and f(GOP) are estimated from the training set using the same features used by the Lottermann model
The one used in our IARIA paper However in that paper the value of -3 is not derived from SITI
10
For comparison, we modify the Lottermann model to include -ML.
Complexity model
Bitrate model
CI, a, b and c are estimated from the training set using the same features as the Lottermann model.
RI, d, e and f are estimated from the training set using the same features as the Lottermann model.
11
Training: 27 videos (office_1, classroom_1, party_1); test: 81 videos
Results compared to the modified Lottermann model (ICIP 2014)
A few observations:
- The bitrate estimation error is significantly lower if we use the non-standard SI and TI.
- If we use the standard SI and TI, the above result is the best.
- If we use a different training set (i.e. office_4, classroom_3 and party_2), the result is worse or even bad, especially in % of error.
- If the non-standard SI and TI are used (ICIP 2011), the result does not change much regardless of which training set I use.
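The training/test split described above can be sketched as follows. The video naming scheme (event_scene_camN) and the function name are assumptions for illustration; the counts follow from 3 events, 4 scenes per event and 9 cameras.

```python
EVENTS = ("office", "classroom", "party")
SCENES = range(1, 5)    # 4 scenes per event
CAMERAS = range(1, 10)  # 9 cameras per scene


def split_videos(training_scenes):
    """Split all 108 videos into training and test lists.

    training_scenes maps each event to the scene number used for training,
    e.g. {"office": 1, "classroom": 1, "party": 1}.
    """
    train, test = [], []
    for event in EVENTS:
        for scene in SCENES:
            for cam in CAMERAS:
                name = f"{event}_{scene}_cam{cam}"  # hypothetical naming scheme
                if training_scenes[event] == scene:
                    train.append(name)
                else:
                    test.append(name)
    return train, test


default_train, default_test = split_videos(
    {"office": 1, "classroom": 1, "party": 1}
)
alt_train, alt_test = split_videos({"office": 4, "classroom": 3, "party": 2})
```

Swapping the dictionary is enough to repeat the experiment with the alternative training set mentioned above.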
Note: The papers in IEEE Trans. CSVT and ICIP that I use as references do not compare % of error. They only provide the PC (Pearson correlation) coefficient and RMSE.
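The three figures of merit mentioned above can be computed as in the sketch below. The exact definition of "% of error" is not stated in the slides; the mean absolute percentage error is assumed here, and the function names are hypothetical.

```python
import numpy as np


def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient between measured and estimated values."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(a, p)[0, 1])


def rmse(actual, predicted):
    """Root-mean-square error, in the same unit as the inputs."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((p - a) ** 2)))


def mean_abs_percent_error(actual, predicted):
    """Mean absolute error relative to the measured value, in percent.

    Assumes no measured value is zero.
    """
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((p - a) / a)) * 100.0)


# Synthetic bitrates in kbps (not results from the slides).
measured = [100.0, 200.0, 400.0]
estimated = [110.0, 190.0, 400.0]
print(pearson_correlation(measured, estimated))
print(rmse(measured, estimated))
print(mean_abs_percent_error(measured, estimated))
```

A percentage error weights low-bitrate videos as heavily as high-bitrate ones, which is one reason it can look much worse than RMSE on the same predictions.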
12
Complexity for different ML and GOP sizes (QP=28), office_2 cam1 video
Bitrate for different QP and GOP sizes, office_2 cam1 video
13
References
- ITU-T, "P.910: Subjective video quality assessment methods for multimedia applications," Tech. Rep. P.910, ITU-T (1992).
- Zhihai He, Yongfang Liang, Lulin Chen, Ishfaq Ahmad, and Dapeng Wu, "Power-Rate-Distortion Analysis for Wireless Video Communication Under Energy Constraints," IEEE Trans. CSVT, Vol. 15, No. 5, May 2005.
- Zhihai He and Dapeng Wu, "Resource Allocation and Performance Analysis of Wireless Video Sensors," IEEE Trans. CSVT, Vol. 16, No. 5, May 2006.
- Yifeng He, Ivan Lee, and Ling Guan, "Distributed Algorithms for Network Lifetime Maximization in Wireless Visual Sensor Networks," IEEE Trans. CSVT, Vol. 19, No. 5, May 2009.
- Yang Peng and Eckehard Steinbach, "A Novel Full-reference Video Quality Metric and its Application to Wireless Video Transmission," ICIP 2011.
- Yen-Fu Ou, Zhan Ma, Tao Liu, and Yao Wang, "Perceptual Quality Assessment of Video Considering Both Frame Rate and Quantization Artifacts," IEEE Trans. CSVT, Vol. 21, No. 3, March 2011.
- Zhan Ma, Meng Xu, Yen-Fu Ou, and Yao Wang, "Modeling of Rate and Perceptual Quality of Compressed Video as Functions of Frame Rate and Quantization Stepsize and Its Applications," IEEE Trans. CSVT, Vol. 22, No. 5, May 2012.
- Christian Lottermann, Alexander Machado, Damien Schroeder, Yang Peng, and Eckehard Steinbach, "Bit Rate Estimation for H.264/AVC Video Encoding Based on Temporal and Spatial Activities," ICIP 2014.