UCLA ONR Presentation
August 4, 2005
John Villasenor (villa@icsl.ucla.edu)
David Choi (dschoi@icsl.ucla.edu), Hyungjin Kim (hjkimnov@ee.ucla.edu), Dong-U Lee (dongu@icsl.ucla.edu)
Overview
• Focus of work to date
– Robust imagery
– Automatic adaptation to network parameters
– Region of interest (ROI) coding, including integration of target tracking information
– Improved imagery using system-wide optimizations including channel coding, image coding, and receiver signal detection
– Several generations of embedded hardware platform; integration and deployment on helicopter platforms
Recent Efforts
• Multi-layered video streams with a base and enhancement layer
• Region of interest coding using a base/enhancement layer system
• Reduced complexity image representation for environments with severe power constraints
• Inherently secure encoding
• System-level optimizations (current focus: timing recovery) to improve end-to-end imaging capabilities
Enhancement Layer Video
• Concept: in networks with communication links of varying bandwidths and capacities,
– Send base layer video to all clients using a legacy standards-based video codec implementation
– Send enhancement layer video selectively to clients that can support the additional bandwidth
[Diagram: camera → video encoder; the base layer stream is sent to all clients (both high- and low-bandwidth); the enhancement layer stream is sent only to the high-bandwidth clients]
*in collaboration with Innovative Concepts
Enhancement Layer Encoding/Decoding
• Encoder: Leverage standards-based encoding platform
[Diagram — Encoder: the original frame passes through a video encoder to produce the compressed base layer; the difference between the original frame and the reconstructed base layer frame passes through a second video encoder to produce the compressed enhancement layer. Decoder: the compressed base layer and compressed enhancement layer are each run through a video decoder, and the decoded base layer frame and enhancement layer frame are summed to give the recovered frame.]
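The encode/decode structure above can be sketched end to end; here coarse and fine scalar quantizers stand in for the standards-based video codecs (a hypothetical simplification, not the deployed implementation):

```python
import numpy as np

def encode_layers(frame, base_step=16.0, enh_step=4.0):
    """Produce a coarsely quantized base layer and an enhancement
    layer that codes the base-layer residual (difference frame)."""
    base = np.round(frame / base_step) * base_step        # stand-in for base codec
    residual = frame - base                               # difference frame
    enhancement = np.round(residual / enh_step) * enh_step
    return base, enhancement

def decode_layers(base, enhancement=None):
    """Low-bandwidth clients decode the base layer only; clients with
    spare bandwidth add the enhancement layer back in."""
    return base if enhancement is None else base + enhancement

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, (240, 320)).astype(np.float64)
base, enh = encode_layers(frame)
err_base = np.abs(decode_layers(base) - frame).max()       # <= base_step / 2
err_both = np.abs(decode_layers(base, enh) - frame).max()  # <= enh_step / 2
assert err_both <= err_base
```

In a real deployment each quantizer would be replaced by an H.263/H.264-class encoder, as in the block diagram.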
Enhancement Coding Example
• Enhancement layer coding can provide baseline imagery to low bandwidth clients and higher quality imagery to high bandwidth clients
[Images: base layer frame, enhancement layer frame, and combined base + enhancement frame]
Improvement in Video Quality through Enhancement Layer Video
[Figure: PSNR vs. base + enhancement bitrate, simulated with 320x240 video. Points along a curve show the improvement from the enhancement layer; different curves represent different starting base layer bitrates.]
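PSNR, the quality metric on these axes, is the standard peak signal-to-noise ratio; for 8-bit imagery:

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((240, 320))
noisy = ref + 10.0                 # uniform error of 10 gray levels
print(round(psnr(ref, noisy), 2))  # → 28.13
```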
PSNR Difference Plot
[Figure: PSNR difference plots. y-axis: PSNR(enhancement + base) − PSNR(base encoded at the enhancement bitrate), in dB (−2 to 8); x-axis: enhancement layer bitrate (0 to 2500 kbps). One curve per base layer bitrate: 200k, 400k, 600k, 800k, 1000k.]
The plots show the PSNR improvement of enhancement layer video over video re-encoded at the same rate, as a function of the enhancement layer bitrate.
Enhancement Layer Region of Interest
• Application of the enhancement layer coding concept to region of interest coding
• High-quality information for the region of interest may be sent when bandwidth becomes available on certain network connections
Image representation using reduced energy processing
• Use edge information to convey scene and location context
• Provides significant scene information while dramatically reducing energy consumption and memory utilization
• Simple, efficient compression algorithms can be applied
[Images: original image (left) and edge-detected image (right)]
Generalized Gaussian Source
Generalized Gaussian source pdf:

f_X(x) = C_1 \exp\left( -C_2 |x|^v \right)

where C_1, C_2, and \eta are given by

C_1 = \frac{v \, \eta(v, \sigma)}{2 \, \Gamma(1/v)}, \qquad
C_2 = \left[ \eta(v, \sigma) \right]^v, \qquad
\eta(v, \sigma) = \frac{1}{\sigma} \left[ \frac{\Gamma(3/v)}{\Gamma(1/v)} \right]^{1/2}

v is the shape parameter:
v = 2: Gaussian
v = 1: Laplacian
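As a numerical sanity check, the pdf can be evaluated directly (standard generalized Gaussian form; v = 2 recovers the ordinary Gaussian and v = 1 the Laplacian):

```python
import math

def gg_pdf(x, v, sigma=1.0):
    """Generalized Gaussian pdf f(x) = C1 * exp(-C2 * |x|**v)."""
    eta = (1.0 / sigma) * math.sqrt(math.gamma(3.0 / v) / math.gamma(1.0 / v))
    c1 = v * eta / (2.0 * math.gamma(1.0 / v))
    c2 = eta ** v
    return c1 * math.exp(-c2 * abs(x) ** v)

# v = 2 reduces to the ordinary Gaussian N(0, sigma^2)
gauss_at_1 = math.exp(-0.5) / math.sqrt(2 * math.pi)   # N(0,1) pdf at x = 1
assert abs(gg_pdf(1.0, v=2.0) - gauss_at_1) < 1e-12
```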
Generalized Gaussian Source
• Choosing the "shape parameter" v allows representation of a wide range of pdfs
[Figure: pdf of the generalized Gaussian on x ∈ [−5, 5] for v = 0.5, 1, and 2; smaller v gives a sharper peak and heavier tails]
Golomb Rice (GR) and exponential Golomb (EG) Codes
• GR and EG codes are classes of Huffman codes that have highly regular structure
• There is no need for an explicit codebook; the codebook is implicit in the choice of the code
• EG codes are particularly well suited to coding of image data that has been processed and then quantized or thresholded.
UCLA ONR 14
Structure of Golomb Rice Codes
• Code trees (indices 0-9):

Index  k=1       k=2      k=3
0      1 0       1 00     1 000
1      1 1       1 01     1 001
2      01 0      1 10     1 010
3      01 1      1 11     1 011
4      001 0     01 00    1 100
5      001 1     01 01    1 101
6      0001 0    01 10    1 110
7      0001 1    01 11    1 111
8      00001 0   001 00   01 000
9      00001 1   001 01   01 001

Prefix: the number of zeros describes the "depth" in the tree. Suffix: describes which branch at a particular depth.
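These codewords can be generated directly, with no explicit codebook; a minimal sketch:

```python
def golomb_rice_encode(n, k):
    """Golomb-Rice codeword for index n >= 0 with parameter k:
    unary prefix (q zeros then a 1) giving the quotient q = n >> k,
    followed by the k low-order bits of n as the suffix."""
    q = n >> k                      # tree depth (number of prefix zeros)
    r = n & ((1 << k) - 1)          # branch at that depth
    return "0" * q + "1" + format(r, f"0{k}b")

# Matches the k = 1 column of the code tree:
assert golomb_rice_encode(0, 1) == "10"
assert golomb_rice_encode(4, 1) == "0010"
assert golomb_rice_encode(9, 1) == "000011"
```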
UCLA ONR 15
Structure of Exponential Golomb Codes
Code trees (indices 0-9):

Index  k=0       k=1       k=2
0      1         01 0      001 00
1      01 0      01 1      001 01
2      01 1      001 00    001 10
3      001 00    001 01    001 11
4      001 01    001 10    0001 000
5      001 10    001 11    0001 001
6      001 11    0001 000  0001 010
7      0001 000  0001 001  0001 011
8      0001 001  0001 010  0001 100
9      0001 010  0001 011  0001 101
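For the k = 0 tree, the codeword construction is especially simple (standard order-0 exp-Golomb):

```python
def exp_golomb_encode(n):
    """Order-0 exponential-Golomb codeword for index n >= 0:
    write n + 1 in binary, then prepend one fewer zeros than
    the binary form has bits."""
    binary = format(n + 1, "b")
    return "0" * (len(binary) - 1) + binary

assert exp_golomb_encode(0) == "1"
assert exp_golomb_encode(1) == "010"
assert exp_golomb_encode(9) == "0001010"
```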
UCLA ONR 16
Coding Efficiency for Generalized Gaussian Source
Efficiency of Golomb-Rice and exp-Golomb codes for positive discrete sources derived from a generalized Gaussian, showing the effect of code choice and code parameter k. Figure a shows the case v = 0.7; curves for v = 0.3 are given in Figure b. The ratio δ/σ conveys the effect when samples from a generalized Gaussian source with standard deviation σ are quantized using step size δ.
Figure a, v = 0.7; Figure b, v = 0.3
UCLA ONR 17
Histogram of Data from Edge Detected Image
Coding efficiency = 92%
UCLA ONR 18
Complexity of Edge + EG coding
• Complexity of edge detection combined with EG coding is significantly less than that of video coding using the DCT and motion compensation
• Edge detection algorithm
– 4 shifts, 11 adds, 2 abs() per pixel
• Exp-Golomb code
– Codewords can be easily generated using a state machine
– The number of operations depends on the run-length statistics: each additional bit in the prefix requires an additional 2 shifts and 2 additions
– In the typical images we have observed, coding takes under 0.2 adds and 0.2 shifts per pixel
• Total
– Dominated by edge processing; e.g. 4 shifts, 11 adds, 2 abs() per pixel
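A gradient-based edge detector in this complexity class can be sketched as follows (a hypothetical stand-in using only adds, subtracts, and abs(); the deployed algorithm's exact operation mix differs):

```python
import numpy as np

def edge_detect(img, threshold=32):
    """Binary edge map from first-difference horizontal and vertical
    gradients; per pixel: two subtracts, two abs(), one add, one compare."""
    img = img.astype(np.int32)                      # avoid uint8 wraparound
    gx = np.abs(img[:, 1:] - img[:, :-1])[:-1, :]   # horizontal gradient
    gy = np.abs(img[1:, :] - img[:-1, :])[:, :-1]   # vertical gradient
    return (gx + gy) > threshold                    # 1-bit edge map

img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255                                    # vertical step edge
edges = edge_detect(img)
assert edges[:, 3].all() and not edges[:, 0].any()
```

The 1-bit output map is exactly the kind of sparse, run-length-friendly data that EG coding handles efficiently.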
UCLA ONR 19
Complexity of Video coding
• The two most computationally expensive steps of video coding are the DCT and motion compensation calculations
• DCT
– For each 8x8 block, a well-optimized implementation uses 536 additions and 192 multiplies
– For 640x480, there are 80x60 = 4,800 blocks
– Total of 2,572,800 additions and 921,600 multiplies per frame
– Average of 8.4 additions and 3 multiplies per pixel, plus associated memory fetches/stores
UCLA ONR 20
Complexity Analysis of Video Coding
• Motion search
– For each MxM block with a search offset of ±L:
• (2L+1)^2 candidate offsets
• 2xMxM addition operations per offset
• MxM memory fetches per offset
– E.g. a 16x16 block with an offset of ±16:
• 1,089 x 2 x 16 x 16 = 557,568 additions
• 1,089 x 16 x 16 = 278,784 memory fetches
– For 640x480, there are 1,200 blocks
– Total of 669,081,600 additions per frame
– Average of 2,178 additions and 1,089 memory fetches per pixel
– Reducing the search range to an offset of ±8 gives 578 additions and 289 memory fetches per pixel
• Entropy coder
– The complexity of H.263's entropy coder depends on the specific image, etc.
– Huffman coding of DCT coefficients requires additional memory overhead for lookup tables
– H.264 Main Profile (CABAC) would involve significant extra cost due to arithmetic coding
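The motion-search operation counts above follow from a few lines of arithmetic (full-search block matching; frame and block sizes per the example):

```python
def motion_search_ops(width=640, height=480, M=16, L=16):
    """Per-pixel addition and memory-fetch counts for full-search
    block matching with MxM blocks and a search offset of +/- L."""
    offsets = (2 * L + 1) ** 2             # candidate displacements per block
    adds_per_block = offsets * 2 * M * M   # subtract + accumulate per pixel
    fetches_per_block = offsets * M * M
    blocks = (width // M) * (height // M)
    pixels = width * height
    return (blocks * adds_per_block / pixels,
            blocks * fetches_per_block / pixels)

adds, fetches = motion_search_ops()
print(adds, fetches)          # → 2178.0 1089.0
adds8, fetches8 = motion_search_ops(L=8)
print(adds8, fetches8)        # → 578.0 289.0
```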
UCLA ONR 21
Comparison
• The energy cost for multiplies relative to adds varies with processor, implementation, etc., but an order of magnitude is a reasonable approximation
• Edge+EG required 11 adds/pixel (disregard shifts and abs() as these are nearly free)
• Video encoding required between approximately 500 and 2000 adds/pixel, dominated by motion compensation
• Difference of approximately 2-3 orders of magnitude in energy cost
• Can reduce video coding burden using sub-optimal fast motion searches, etc., but reductions will still leave edge+EG at an energy advantage of several orders of magnitude
UCLA ONR 22
Hybrid system representing object edges, texture in ROI
• Energy-reducing benefits of edge-based representation
• Image quality benefits of traditional video coding in a region of interest
• Example: Total frame size 640 by 480
• ROI: 200 by 160
• Reduces overall energy consumption by approximately an order of magnitude
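A back-of-the-envelope check of the order-of-magnitude claim, using the rough per-pixel costs from the preceding slides (roughly 2000 add-equivalents/pixel for full-frame video coding, 11 for edge + EG; both values are approximations):

```python
def hybrid_energy_ratio(frame_px, roi_px, video_cost=2000.0, edge_cost=11.0):
    """Energy of video-coding only the ROI plus edge-coding the full
    frame, relative to video-coding the whole frame. Costs are in
    add-equivalents per pixel (approximate slide estimates)."""
    full_video = frame_px * video_cost
    hybrid = roi_px * video_cost + frame_px * edge_cost
    return hybrid / full_video

# 640x480 frame with a 200x160 region of interest
ratio = hybrid_energy_ratio(640 * 480, 200 * 160)
assert ratio < 0.15   # roughly an order-of-magnitude reduction
```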
UCLA ONR 23
Secure arithmetic coding
• Used in many coding standards, including JPEG2000 (still images) and H.264 (video)
• Associates a sequence of symbols with a position in the range [0,1)
• Enables high coding efficiency
• Not secure as traditionally implemented
• Recursive partitioning is used prior to encoding of each new symbol
• Every position in [0,1) is associated with a unique symbol string (binary example shown below)
[Diagram: after one symbol the interval [0,1) splits into A | B; after two symbols into AA | AB | BA | BB; after three symbols into AAA through BBB]
UCLA ONR 24
Interval splitting with one symbol
• Key k0 identifies where the interval is to be split
• The portion of the A interval to the right of the key is moved to the right of the B interval
• B is unchanged; a representation in the B interval will have the same codeword length as in traditional AC
• A has two subintervals: at most one more bit is needed relative to traditional AC, though in some cases there is no increase in length
The result of interval splitting with 1 symbol
UCLA ONR 25
Interval ordering diversity
[Diagrams: interval orderings resulting from keys [0.45, 0.23], [0.85, 0.40], and [0.25, 0.78]]
UCLA ONR 26
End-to-end quality optimization: High level block diagram
Transmitter: Image input → Image coder → Channel coder → Transmission
Receiver: Timing recovery → Channel decoder → Image decoder
UCLA ONR 27
Timing Recovery
• Symbol timing errors occur at receivers due to clock differences, Doppler effects, etc.
• Traditional method: use simple PLL-based circuits
• Our approach: data-aided iterative timing recovery
– use information from the LDPC decoder at each iteration to assist synchronization
UCLA ONR 28
Block Diagram
[Diagram: timing recovery loop]
UCLA ONR 29
Bit Error Rate
[Figure: BER vs. Eb/No (0.5 to 2.5 dB), 15 iterations; curves without and with LDPC feedback, with the feedback curve reaching lower BER]
UCLA ONR 30
Timing Recovery: Demonstration
• AWGN channel at Eb/No = 2 dB with random phase offsets
Without LDPC feedback: BER = 10^-0.7
With LDPC feedback: BER = 10^-1.5
UCLA ONR 31
Conclusions
• Traditional assumption that “efficient” representation of imagery means maximizing compression needs to be re-examined. Maximizing energy efficiency can be more important than maximizing compression efficiency
• Need methods to convey scene content that reflect power, bandwidth, memory and transmission reliability characteristics, limitations, and statistics of a given platform and environment
• Alternative scene/object representation methods, specifically aimed at low energy with no attempt at esthetic quality, hold promise. Contrast with (mostly failed) previous attempts at “object-based” coding
UCLA ONR 32
Conclusions (continued)
• Low power imaging sensors and networks of such sensors likely to be critical
• Local processing critical, realistic collaboration also has potential
• Low power imaging event detection strategies needed; can’t simply be doing high energy image processing continuously while waiting for events which may occur rarely
• Additional challenge in appropriately determining which information to convey, when, to whom, and how to convey it
• Need proper balance of autonomous and human management of imaging networks, proper balance of video vs. still imagery, resolution vs. rate, etc.
• Approx $20K forecast to remain as of Oct 05