SINGLE IMAGE SUPER RESOLUTION
Seminar presented by: Tomer Faktor
Advanced Topics in Computer Vision (048921)
12/01/2012
OUTLINE
• What is Image Super Resolution (SR)?
• Prior Art
• Today – Single Image SR:
Using patch recurrence [Glasner et al., ICCV09]
Using sparse representations [Yang et al., CVPR08], [Zeyde et al., LNCS10]
WHAT IS IMAGE SR?
• Inverse problem – underdetermined
• Image SR – how to make it determined:

  How                                                         | Type
  Set of low res. images                                      | Multi-Image
  External database of high-low res. pairs of image patches   | Example-Based
  Image model/prior                                           | Single-Image
GOALS OF IMAGE SR
• Always:
  More pixels – according to the scale factor
  Low res. edges and details – maintained
• Bonus:
  Blurred edges – made sharper
  New high res. details that are missing in the low res. image
EXAMPLE-BASED SR (IMAGE HALLUCINATION)
• Algorithm relies on an external database of low and high res. image patches [Freeman01, Kim08]
• The blur kernel & scale factor relating the pairs are assumed known
PRIOR ART
• No blur, only down-sampling – interpolation:
  LSI schemes – NN, bilinear, bicubic, etc.
  Spatially adaptive and non-linear filters [Zhang08, Mallat10]
• Blur + down-sampling: interpolation + deblurring
PRIOR ART
• LS problem:

  ŷ = argmin_y ‖SHy − z‖²₂

• Add a regularization:
  Tikhonov regularization, robust statistics, TV, etc.
  Sparsity of global transform coefficients
  Parametric learned edge model [Fattal07, Sun08]
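As an illustration of the regularized formulation above, here is a minimal NumPy sketch, with toy stand-ins for every operator (a 3-tap circular blur for H, 2× decimation for S, a circular first-difference Tikhonov regularizer D), solved by plain gradient descent:

```python
import numpy as np

def blur_downsample(y, s=2):
    """The SH operator: 3-tap circular blur followed by s-fold decimation (toy)."""
    yb = 0.25 * np.roll(y, 1) + 0.5 * y + 0.25 * np.roll(y, -1)
    return yb[::s]

def blur_downsample_T(z, n, s=2):
    """Adjoint (SH)^T, needed for the gradient: zero-fill upsampling, then the same blur."""
    up = np.zeros(n)
    up[::s] = z
    return 0.25 * np.roll(up, 1) + 0.5 * up + 0.25 * np.roll(up, -1)

def tikhonov_sr(z, n, lam=0.1, step=0.5, iters=500):
    """Minimize ||SHy - z||^2 + lam * ||Dy||^2 by gradient descent."""
    y = np.zeros(n)
    for _ in range(iters):
        data_grad = blur_downsample_T(blur_downsample(y) - z, n)
        dy = np.roll(y, -1) - y             # Dy (circular first difference)
        reg_grad = np.roll(dy, 1) - dy      # D^T D y
        y -= step * (data_grad + lam * reg_grad)
    return y
```

This is only a sketch of the classical regularized approach, not any specific paper's solver; in practice the blur kernel, regularizer, and optimizer all vary.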
BASIC ASSUMPTION
Patches in a single image tend to redundantly recur many
times within and across scales
SUGGESTED APPROACH
• Patch recurrence ⇒ a single image is enough for SR
• Adapt ideas from classical and example-based SR into a unified framework:
  SR linear constraints – in-scale patch redundancy instead of multiple images
  Correspondence between low-high res. patches – cross-scale redundancy instead of an external database
EXPLOITING IN-SCALE PATCH REDUNDANCY
• In the low res. image – find K nearest-neighbor patches for each pixel
• Compute their subpixel alignments
• Assign a different weight to each linear SR constraint, according to patch similarity
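The K-NN step can be sketched as a brute-force nearest-patch search (a toy version; the actual method also uses the subpixel alignments and faster approximate search):

```python
import numpy as np

def extract_patches(img, p=5):
    """All overlapping p x p patches of img, flattened into rows, with positions."""
    H, W = img.shape
    pos = [(i, j) for i in range(H - p + 1) for j in range(W - p + 1)]
    P = np.stack([img[i:i + p, j:j + p].ravel() for i, j in pos])
    return P, pos

def knn_patches(P, query, k=9):
    """Indices of the k patches in P closest to the query patch (L2, brute force)."""
    d = np.linalg.norm(P - query, axis=1)
    return np.argsort(d)[:k]
```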
EXPLOITING CROSS-SCALE REDUNDANCY
[Figure: an image pyramid I₋ₖ, …, I₀ (input), …, Iₖ. Levels above I₀ are unknown, increasing the resolution by a factor α; levels below I₀ are known, decreasing the resolution by a factor α. For a low res. patch, a NN is found in a coarser level, and the NN's high res. "parent" region is copied into the corresponding unknown level.]
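A minimal sketch of the cross-scale matching idea, assuming a naive area-average downscaling and exhaustive search (the paper uses the proper blur kernel and subpixel-accurate, approximate NN search):

```python
import numpy as np

def downscale(img, a=2):
    """Naive area-average downscaling by integer factor a (toy stand-in)."""
    H, W = img.shape
    img = img[:H - H % a, :W - W % a]
    return img.reshape(H // a, a, W // a, a).mean(axis=(1, 3))

def find_hr_parent(img, qi, qj, p=5, a=2):
    """Match the p x p patch at (qi, qj) against every patch of the downscaled
    image; return the (a*p) x (a*p) high res. 'parent' region of the best match."""
    q = img[qi:qi + p, qj:qj + p].ravel()
    small = downscale(img, a)
    H, W = small.shape
    best, best_d = (0, 0), np.inf
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            d = np.sum((small[i:i + p, j:j + p].ravel() - q) ** 2)
            if d < best_d:
                best, best_d = (i, j), d
    i, j = best
    return img[a * i:a * (i + p), a * j:a * (j + p)]  # copied as a high res. example
```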
IMPORTANT IMPLEMENTATION DETAILS
• Coarse-to-fine – gradual increase in resolution; needed for numerical stability and improves the results
• Back-projection [Irani91] – ensures consistency of the recovered high res. image with the low res. input
• Color images – convert RGB to YIQ:
  For Y – the suggested approach
  For I,Q – bicubic interpolation
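The back-projection step can be sketched as follows (a toy version with area-average downsampling and nearest-neighbor error spreading; [Irani91] uses the actual blur kernel and a back-projection filter):

```python
import numpy as np

def downsample(img, s=2):
    """Area-average downsampling (toy stand-in for the true blur + decimation)."""
    H, W = img.shape
    return img.reshape(H // s, s, W // s, s).mean(axis=(1, 3))

def upsample(img, s=2):
    """Nearest-neighbor upsampling, used here to spread the error back."""
    return np.kron(img, np.ones((s, s)))

def back_project(hi, lo, s=2, iters=20, step=1.0):
    """Iteratively correct the high res. estimate hi until downsampling it
    reproduces the observed low res. image lo (sketch in the spirit of [Irani91])."""
    for _ in range(iters):
        err = lo - downsample(hi, s)
        hi = hi + step * upsample(err, s)
    return hi
```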
RESULTS – PURELY REPETITIVE STRUCTURE
[Figure: input low res. image and a high res. detail, compared under bicubic interpolation, only in-scale redundancy, and in- plus cross-scale redundancies.]
RESULTS – TEXT IMAGE
[Figure: input low res. image (×3), compared under bicubic interp., the edge model of [Fattal07], the suggested approach, and the ground truth. Small digits recur only in-scale and cannot be recovered.]
RESULTS – NATURAL IMAGE
[Figure: input low res. image (×4), compared under example-based SR [Kim08] and the suggested approach.]
PAPER EVALUATION
• Reasonable assumption – validated empirically
• Very novel – a new SR framework, combining two widely used approaches in the SR field
• Solution technically sound
• Well written, very nice paper!
• Many visual results (nice webpage)
PAPER EVALUATION
• Not fully self-contained – almost no details on:
  Subpixel alignment
  Weighted classical SR constraints
  Back-projection
• No numerical evaluation (PSNR/SSIM) of the results
• No code available
• No details on running time
ON SINGLE IMAGE SCALE-UP USING SPARSE-REPRESENTATIONS
R. Zeyde, M. Elad and M. Protter
Curves & Surfaces, 2010
SPARSE-LAND PRIOR
[Figure: a signal y is modeled as y = Aq, where A is an n×m dictionary (m > n) and q is a sparse vector with few non-zeros.]
• Widely-used signal model with numerous applications
• Efficient algorithms for pursuit and dictionary learning
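A reference sketch of one standard pursuit algorithm for this model – Orthogonal Matching Pursuit – written naively in NumPy:

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily select up to k atoms of A,
    refitting the coefficients by least squares after every selection."""
    x = np.zeros(A.shape[1])
    resid = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ resid)))   # most correlated atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        resid = y - A[:, support] @ coef
    x[support] = coef
    return x
```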
BASIC ASSUMPTIONS [Yang08] & [Zeyde10]
• Sparse-land prior for image patches
• Patches in a high-low res. pair have the same sparse representation over a high-low res. dictionary pair:

  p_h^k = A_h q_k ,   p_l^k = A_l q_k   for all k

• Same procedure for all patches
• Joint training of the dictionary pair!
JOINT REPRESENTATION – JUSTIFICATION
• Each low res. patch is generated from the high res. one by the same LSI operator (matrix):

  p_l^k = L p_h^k   for all k

• A pair of high-low res. dictionaries with the correct correspondence leads to a joint representation of the patches:

  p_h^k = A_h q_k + v_h  ⇒  p_l^k = L A_h q_k + L v_h = A_l q_k + v_l ,   with A_l = L A_h
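This argument is easy to verify numerically; the snippet below builds a random toy dictionary pair satisfying A_l = L A_h (all sizes and operators are hypothetical stand-ins) and checks that the same sparse vector represents both patches in the noiseless case:

```python
import numpy as np

rng = np.random.default_rng(0)
n_h, n_l, m = 25, 9, 40                  # toy patch/dictionary sizes
A_h = rng.standard_normal((n_h, m))      # high res. dictionary (random toy)
L = rng.standard_normal((n_l, n_h))      # LSI blur + decimation operator (toy stand-in)
A_l = L @ A_h                            # induced low res. dictionary

q = np.zeros(m)
q[[3, 17]] = [1.5, -0.7]                 # one sparse representation
p_h = A_h @ q                            # high res. patch (noiseless case, v_h = 0)
p_l = L @ p_h                            # the observed low res. patch

# the same q represents the low res. patch over A_l:
assert np.allclose(p_l, A_l @ q)
```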
RESEARCH QUESTIONS [Yang08] & [Zeyde10]
• Pre-processing?
• How to train the dictionary pair?
• What is the training set?
• How to utilize the dictionary pair for SR?
• Post-processing?
PRE-PROCESSING – LOW RES.
1. Bicubic interpolation of the low res. image
2. Feature extraction by HPFs (e.g., the gradient filter [1, 0, −1] and the Laplacian filter [0.5, 0, −1, 0, 0.5])
PRE-PROCESSING – LOW RES.
3. Dimensionality reduction via PCA – projection onto a reduced PCA basis, yielding the features for the low res. patches p_l^k
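The PCA projection step can be sketched as follows (the variance threshold var_keep is an illustrative parameter, not a value taken from the paper):

```python
import numpy as np

def pca_reduce(F, var_keep=0.999):
    """Project the feature rows of F onto the leading PCA directions that
    retain a var_keep fraction of the variance."""
    mu = F.mean(axis=0)
    Fc = F - mu
    _, s, Vt = np.linalg.svd(Fc, full_matrices=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    r = int(np.searchsorted(energy, var_keep)) + 1
    B = Vt[:r]                       # reduced PCA basis (rows)
    return Fc @ B.T, B, mu
```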
TRAINING THE LOW RES. DICTIONARY
K-SVD dictionary learning on the low res. patch features:

  A_l , {q_k} = argmin_{A_l, {q_k}} Σ_k ‖p_l^k − A_l q_k‖²₂   s.t.  ‖q_k‖₀ ≤ 3  for all k
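A compact toy implementation of the K-SVD loop – naive OMP sparse coding alternated with rank-1 SVD updates of each atom (real K-SVD implementations are far more optimized):

```python
import numpy as np

def omp_codes(A, Y, k):
    """Sparse-code every column of Y with k-atom OMP (naive reference version)."""
    Q = np.zeros((A.shape[1], Y.shape[1]))
    for t in range(Y.shape[1]):
        y = Y[:, t]
        resid, S, coef = y.copy(), [], np.zeros(0)
        for _ in range(k):
            j = int(np.argmax(np.abs(A.T @ resid)))
            if j not in S:
                S.append(j)
            coef, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)
            resid = y - A[:, S] @ coef
        Q[S, t] = coef
    return Q

def ksvd(Y, m, k, iters=10, seed=0):
    """Toy K-SVD: alternate OMP sparse coding with rank-1 SVD atom updates."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((Y.shape[0], m))
    A /= np.linalg.norm(A, axis=0)
    for _ in range(iters):
        Q = omp_codes(A, Y, k)
        for j in range(m):
            users = np.nonzero(Q[j])[0]       # signals that use atom j
            if users.size == 0:
                continue
            # error matrix without atom j's contribution
            E = Y[:, users] - A @ Q[:, users] + np.outer(A[:, j], Q[j, users])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            A[:, j] = U[:, 0]                 # best rank-1 fit updates the atom...
            Q[j, users] = s[0] * Vt[0]        # ...and its coefficients
    return A, Q
```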
WHICH TRAINING SET?
• External set of true high res. images (off-line training):
  Generate the low res. images by applying SH
  Collect the patch pairs {p_h^k, p_l^k}
• The image itself (bootstrapping):
  Input low res. image = "high res."
  Proceed as before…
RECONSTRUCTION PHASE
Low res. image y_l → pre-processing → patches p_l^k → OMP over A_l → sparse representations q_k → high res. patches A_h q_k → image reconstruction → SR image
IMAGE RECONSTRUCTION
• Recover the high res. patches:  p̂_h^k = A_h q_k  for all k
• Merge them by averaging the overlaps:

  ŷ_h = y_l + ( Σ_k R_k^T R_k )⁻¹ Σ_k R_k^T p̂_h^k

• No need to perform back-projection!
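The patch-merging expression ( Σ_k R_k^T R_k )⁻¹ Σ_k R_k^T p̂_k amounts to two accumulator arrays; a minimal sketch:

```python
import numpy as np

def merge_patches(patches, positions, shape, p):
    """Average overlapping p x p patches into an image: the numerator
    accumulates R_k^T p_k, the denominator the overlap counts R_k^T R_k."""
    num = np.zeros(shape)
    den = np.zeros(shape)
    for patch, (i, j) in zip(patches, positions):
        num[i:i + p, j:j + p] += patch.reshape(p, p)
        den[i:i + p, j:j + p] += 1.0
    return num / np.maximum(den, 1e-12)
```

When the input patches are all consistent crops of one image, the merge reproduces that image exactly; with estimated patches it performs the overlap averaging.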
RESULTS – TEXT IMAGE
[Figure: input low res. image (×3). Bicubic interp.: PSNR = 14.68dB; suggested approach: PSNR = 16.95dB; ground truth shown for reference.]
Dictionary pair learned off-line from another text image!
NUMERICAL RESULTS – NATURAL IMAGES

  Image        Bicubic Interp.      Yang et al.          Suggested Alg.
               PSNR     SSIM        PSNR     SSIM        PSNR     SSIM
  Barbara      26.24    0.75        26.39    0.76        26.77    0.78
  Coastguard   26.55    0.61        27.02    0.64        27.12    0.66
  Face         32.82    0.80        33.11    0.80        33.52    0.82
  Foreman      31.18    0.91        32.04    0.91        33.19    0.93
  Lenna        31.68    0.86        32.64    0.86        33.00    0.88
  Man          27.00    0.75        27.76    0.77        27.91    0.79
  Monarch      29.43    0.92        30.71    0.93        31.12    0.94
  Pepper       32.39    0.87        33.33    0.87        34.05    0.89
  PPT3         23.71    0.87        24.98    0.89        25.22    0.91
  Zebra        26.63    0.79        27.95    0.83        28.52    0.84
  Average      28.76    0.81        29.59    0.83        30.04    0.85

Dictionary pair learned off-line from a set of training images!
VISUAL RESULTS – NATURAL IMAGES
[Figure: crops (×3) comparing bicubic interp., Yang et al., the suggested approach, and the ground truth; an artifact in one of the results is marked.]
COMPARING THE TWO APPROACHES

                               Zeyde et al.                        Glasner et al.
  Problem                      Single image SR                     Single image SR
  Basic assumptions            Sparse-land prior, joint            Patch recurrence
                               representation for high-low res.
  Multi-patch constraints      Overlaps                            Subpixel alignment
  High-low res.                Learned dictionary pair             Across-scale redundancy
  correspondence
  Processing                   Image pre-processing                Coarse-to-fine,
                                                                   image post-processing
VISUAL COMPARISON
[Figure: input low res. image (×3) that appears in Zeyde et al., compared under bicubic interp., the suggested off-line approach, the suggested on-line approach, and Glasner et al.]
[Figure: input low res. image (×3) that does not appear in Zeyde et al., compared under bicubic interp., the suggested off-line approach, the suggested on-line approach, and Glasner et al.]
PAPER EVALUATION
• Reasonable assumptions – justified analytically
• Novel – a similar model to Yang et al., but a new algorithmic framework that improves the runtime and image SR quality
• Solution technically sound
• Well written and self-contained
• Code available
• Performance evaluation – both visual and numerical
PAPER EVALUATION
• Experimental validation – not complete:
  Comparison to other approaches – the main focus is on bicubic interp. and Yang et al.
  No comparison between on-line and off-line learning of the dictionary pair on the same image
  Only "good" results are shown; running the code reveals weaknesses with respect to Glasner et al.
FUTURE DIRECTIONS
• Extending the first approach to video: space-time SR from a single video [Shahar et al., CVPR11]
• Merging patch redundancy + sparse representations in the spirit of non-local K-SVD [Mairal et al., ICCV09]
• Coarse-to-fine also for the second approach – to improve numerical stability and SR quality