Ping Tan
Simon Fraser University
Photos vs. Videos (live photos)
• A good photo tells a story
• Stories are better told in videos
• More videos are captured by mobile devices
– Sales of compact cameras have fallen
• 300 hrs of videos are uploaded to YouTube every minute
– Cisco predicts videos account for 70% of internet traffic by 2017
Videos in the Mobile Era (mobile & share)
• how to produce professional videos with mobile devices?
• how to create exciting content?
Challenges in the Mobile Era
the SynthCam app by Marc Levoy [Yu and Gallup 2014]
Mobile:
Share:
Computational Videography
Stabilization
Video defog and stereo
TrackCam
Enhance Video Quality
Enable Advanced Photography
Auto Fence Removal
Pipeline of Video Stabilization
Camera Motion Estimation → Camera Path Smoothing → Re-rendering (by Image Warping)
[Figure: feature tracking between frames t and t + 1; the camera path and the smoothed camera path]
Digital Video Stabilization
Video stabilization techniques can be categorized as:
• 2D method [Matsushita et al. PAMI 2006; Grundmann et al. CVPR 2011]
• 3D method [Liu et al. SIGGRAPH 2009; Liu et al. CVPR 2012; Zhou et al. CVPR 2013]
• 2.5D method [Liu et al. TOG 2011; Goldstein and Fattal, TOG 2012]
Relevant rolling shutter correction techniques [Baker et al. CVPR 2010; Karpenko et al. Stanford Tech Report 2011; Grundmann et al. ICCP 2012]
Popular commercial solutions:
• iMovie, Apple
• YouTube Stabilizer, Google
• After Effects CS6, Adobe
• Movie Maker, Microsoft
Challenges in Consumer Videos
1. Large depth variation
2. Quick camera motion (rotation, zooming)
3. Large moving objects
4. Strong rolling shutter effects
Common Artifacts in Stabilized Videos
1. Not stable enough [Video: input vs. previous method (VirtualDub stabilizer)]
2. Geometry distortion [Video: input vs. previous method (YouTube)]
3. Cropping [Video: input vs. previous method (Adobe After Effects)]
Our Goals
To address the challenges in consumer videos:
1. Large depth variation
2. Quick camera motion
3. Large moving foreground
4. Rolling shutter
By our novel techniques in:
• Motion model & estimation
• Adaptive path smoothing
[Pipeline: Camera Motion Estimation → Camera Path Smoothing → Re-rendering (by Image Warping)]
Contributions
• Shuaicheng Liu, Lu Yuan, Ping Tan, Jian Sun. “SteadyFlow: Spatially Smooth Optical Flow for Video Stabilization”. IEEE CVPR 2014
• Shuaicheng Liu, Lu Yuan, Ping Tan, Jian Sun. “Bundled camera paths for video stabilization”. ACM SIGGRAPH 2013
• Shuaicheng Liu, Yinting Wang, Lu Yuan, Jiajun Bu, Ping Tan, Jian Sun. “Video Stabilization with a depth camera”. IEEE CVPR 2012
camera motion estimation
3D method: structure from motion [Liu et al. SIGGRAPH 2009]
• Spatially-variant motion
• Depends on fragile 3D reconstruction
• Time-consuming
camera motion estimation
2D method: [Matsushita et al. PAMI 2006; Grundmann et al. CVPR 2011]
• Robust
• Efficient
• Homogeneous planar motion
camera motion estimation
2.5D method: [Liu et al. TOG 2011; Goldstein and Fattal, TOG 2012]
• 3D reconstruction → 2D feature tracking
• Spatially-variant motion
• Feature tracking is fragile to quick camera motion (e.g. tracking when rotating)
camera motion estimation
Our Solution: novel flexible 2D motion model
• Spatially-variant motion
• Needs feature correspondences between only 2 frames
our mesh-based motion model
[Figure: conventional single homography vs. our mesh-based motion model, with one homography per grid cell]
• Divide the video frame into a 2D regular grid mesh
• Estimate a homography in each cell (now, spatially-variant motion)
our mesh-based motion model
[Figure: frame t, and frame t + 1 warped from frame t by the per-cell homographies]
Two challenges:
• Maintain continuity in motion estimation
• Estimate motion at textureless cells (e.g. in sky)
Our solution:
• Parameterize the motion by the translations at the mesh grid points
• Estimate all translations by an as-similar-as-possible warping [Igarashi et al. 2005; Liu et al. SIGGRAPH 2009]
• Estimate the homography at each cell from its four mesh vertices before and after the warp
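The last step can be illustrated with a minimal sketch, assuming the four vertices of a cell are known before and after the warp. The function names `cell_homography` and `apply_homography` and the example cell are hypothetical; the homography is fitted with a standard direct linear transform (DLT), which is not necessarily the solver used in the original work.

```python
import numpy as np

def cell_homography(src, dst):
    """Fit the 3x3 homography mapping 4 source vertices to 4 warped vertices (DLT)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # H is the null vector of A: the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_homography(H, pts):
    """Apply H to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

# Hypothetical cell: a unit square whose vertices move by per-vertex translations.
cell = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
translations = np.array([[0.1, 0.0], [0.2, 0.1], [0.1, 0.2], [0.0, 0.1]])
warped = cell + translations
H = cell_homography(cell, warped)
```

Because neighboring cells share vertices, homographies fitted this way agree along the shared cell edges, which is one way to read the continuity requirement.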
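model estimation
Data term: a feature $p$ in frame t and its match $\hat{p}$ in frame t+1 should have the same local bilinear coordinates:
$E_d(\hat{V}) = \sum_p \| \hat{V}_p w_p - \hat{p} \|^2$,
where $\hat{V}_p$ holds the four warped vertices of the cell enclosing $p$ and $w_p$ holds its bilinear weights.
Smooth term: the warp of each cell should be close to a similarity:
$E_s(\hat{V}) = \sum_{\hat{v}} \| \hat{v} - \hat{v}_1 - s R_{90} (\hat{v}_0 - \hat{v}_1) \|^2$,
where $R_{90} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ and $s = \|v - v_1\| / \|v_0 - v_1\|$.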
comparison with global homography
[Figure: frame t warped to frame t+1 with a single homography [Matsushita et al. PAMI 2006] vs. our method]
[Plot: error vs. frame index (frames 0 to 180, error 0 to 0.35) for single homography and mesh-based homography]
[Video: stabilized with a global homography vs. stabilized with our method]
comparison to [Grundmann et al. ICCP 2012]
[Figure: frame t warped to frame t+1 with the homography array (Gaussian smoothness) of [Grundmann et al. ICCP 2012] vs. our method]
[Plot: error vs. frame index (frames 0 to 250, error 0 to 1.6) for homography array and our method]
[Video: stabilized by Youtube.com vs. stabilized with our method]
camera path smoothing
• Low-pass filtering [Morimoto and Chellappa, ICASSP 1999; Matsushita et al. PAMI 2006]
• Polynomial curves [Chen et al. CG Forum 2008]
• Piece-wise smoothing [Gleicher and Liu, Multimedia 2007]
• L1-norm optimization [Grundmann et al. CVPR 2011]
camera path
[Figure: homographies between frames t, t + 1, t + 2 → camera path]
bundled camera paths
[Figure: one camera path per mesh cell (cells 1 to 4)]
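A minimal sketch of "homographies → camera path", assuming the path starts at the identity and is extended by right-multiplying each inter-frame homography, i.e. C(t) = C(t-1) F(t-1). That multiplication order and the helper names `accumulate_path` and `translation` are assumptions made for illustration.

```python
import numpy as np

def accumulate_path(frame_homographies):
    """Chain inter-frame homographies F(0..T-2) into a camera path C(0..T-1)."""
    path = [np.eye(3)]
    for F in frame_homographies:
        C = path[-1] @ F
        path.append(C / C[2, 2])  # keep the homography normalized
    return path

def translation(tx, ty):
    """Homography of a pure 2D translation (toy inter-frame motion)."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

motions = [translation(2, 0), translation(1, -1), translation(3, 1)]
path = accumulate_path(motions)
```

For bundled camera paths, the same accumulation would be run once per mesh cell, using that cell's homography in every frame.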
adaptive smoothing
[Figure: input camera path with jitters and rapid panning; low-pass smoothing (distortion) vs. our adaptive smoothing]
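smooth a single path
Adaptive smoothing by minimizing:
$O(\{P(t)\}) = \sum_t \Big( \|P(t) - C(t)\|^2 + \lambda_t \sum_{r \in \Omega_t} \omega_{t,r}(C) \cdot \|P(t) - P(r)\|^2 \Big)$
• Data term $\|P(t) - C(t)\|^2$: close to original path $C$
• Smoothness $\sum_{r \in \Omega_t} \omega_{t,r}(C) \cdot \|P(t) - P(r)\|^2$: $\Omega_t$ is the temporal range, $\omega_{t,r}$ is the bilateral weight
• Iteratively optimized (according to the Jacobi iterative solver) by
$P^{(\xi+1)}(t) = \frac{1}{\gamma} C(t) + \sum_{r \in \Omega_t, r \neq t} \frac{2 \lambda_t \omega_{t,r}}{\gamma} P^{(\xi)}(r)$, with $\gamma = 1 + 2 \lambda_t \sum_{r \in \Omega_t, r \neq t} \omega_{t,r}$
• Initialized as $P^{(0)}(t) = C(t)$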
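A minimal sketch of single-path smoothing with Jacobi iterations, under simplifying assumptions: the path is a sequence of 2D positions rather than homographies, the smoothing weight `lam` is constant over time, and the bilateral weight is a temporal Gaussian times a Gaussian on the change of the original path. All parameter values and the name `smooth_path` are hypothetical.

```python
import numpy as np

def smooth_path(C, lam=5.0, radius=10, sigma_t=5.0, sigma_m=10.0, iters=20):
    """Smooth a (T, D) path C; returns the smoothed path P of the same shape."""
    C = np.asarray(C, dtype=float)
    idx = np.arange(len(C))
    dt = np.abs(idx[:, None] - idx[None, :])                    # temporal distance
    dm = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)  # change of the original path
    # Bilateral weight: small across long time gaps and across large motion.
    w = np.exp(-dt**2 / (2 * sigma_t**2)) * np.exp(-dm**2 / (2 * sigma_m**2))
    w[dt > radius] = 0.0   # restrict to the temporal range
    w[dt == 0] = 0.0       # exclude r == t
    gamma = 1.0 + 2.0 * lam * w.sum(axis=1, keepdims=True)
    P = C.copy()           # initialize with the original path
    for _ in range(iters):
        # Jacobi update: every P(t) is recomputed from the previous iterate.
        P = (C + 2.0 * lam * (w @ P)) / gamma
    return P

def roughness(path):
    """Sum of squared second differences, a simple jitter measure."""
    return float(np.sum((path[:-2] - 2 * path[1:-1] + path[2:]) ** 2))

# Toy input: a steady pan along x with jitter in both coordinates.
rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
shaky = np.stack([t + rng.normal(0, 0.5, 100), rng.normal(0, 0.5, 100)], axis=1)
smoothed = smooth_path(shaky)
```

The motion-dependent factor of the weight is what makes the smoothing adaptive: frames separated by a rapid pan influence each other less than under a plain low-pass filter.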
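smooth bundled paths
Adaptive smoothing by minimizing:
$\sum_i O(\{P_i(t)\}) + \sum_t \sum_{j \in N(i)} \|P_i(t) - P_j(t)\|^2$
• First term: local adaptive path smoothing of each cell's path
• Second term: spatial smoothness between neighboring cells $j \in N(i)$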
re-rendering
[Figure: frames t-1 and t of the input video are warped to frames t-1 and t of the stabilized video]
Video Results
[Video: with spatial constraint vs. without spatial constraint (the spatial smoothness term)]
[Video: low-pass local path smoothing vs. adaptive local path smoothing]
Computational Efficiency
• CPU: Intel i7 3.2GHz Quad-Core, RAM: 8G
• 400 ~ 600 SURF features / frame
• 720P video (resolution: 1280 × 720): 392 ms / frame (~2.5 fps)
• Per-frame breakdown: extract feature (300 ms), estimate motion (50 ms), smooth paths (12 ms), render frame (30 ms)
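As a quick check of the arithmetic, the per-stage timings reported above can be summed and converted to a frame rate. The dictionary keys are just labels for the reported stages.

```python
# Per-frame stage timings in milliseconds, as reported on the slide.
stage_ms = {
    "extract feature": 300,
    "estimate motion": 50,
    "smooth paths": 12,
    "render frame": 30,
}
total_ms = sum(stage_ms.values())  # total time per frame
fps = 1000.0 / total_ms            # frames per second
feature_share = stage_ms["extract feature"] / total_ms
```

Feature extraction accounts for roughly three quarters of the per-frame time, so it is the natural first target for speeding the pipeline up.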
Video Stabilization Pipeline
• The ‘bundled path’ method prefers a smaller grid size
• What if we use a 1x1 grid size? (optical flow based motion model)
• How to smooth the flow fields?
A Naïve Method
• Obtain feature trajectories from optical flow
• Smooth feature trajectories
• Sub-space constraint between trajectories [Liu et al. 2011]
• Trajectories have irregular shape (which complicates smoothing)
[Figure: a feature trajectory]
Feature Trajectories vs Pixel Profiles
[Figure: a feature trajectory vs. a pixel profile]
• A pixel profile: motion vectors at the same pixel location over time.
• Pixel profiles are regular (which simplifies smoothing)
• Different profiles can be smoothed independently
“SteadyFlow: Spatially Smooth Optical Flow for Video Stabilization”. Shuaicheng Liu, Lu Yuan, Ping Tan, Jian Sun. IEEE CVPR 2014
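The difference between the two representations can be sketched on a stack of dense flow fields. This is an illustration under assumptions: `flows` has shape (T, H, W, 2) and stores (dx, dy) per pixel, the trajectory samples the flow at the nearest pixel, and both function names are hypothetical.

```python
import numpy as np

def pixel_profile(flows, y, x):
    """Motion vectors at the fixed pixel (y, x) in every frame: shape (T, 2)."""
    return flows[:, y, x, :].copy()

def feature_trajectory(flows, y, x):
    """Positions of a point that starts at (y, x) and follows the flow: shape (T + 1, 2)."""
    T, H, W, _ = flows.shape
    pos = np.array([x, y], dtype=float)
    track = [pos.copy()]
    for t in range(T):
        # Nearest-pixel lookup (a simplification; bilinear sampling would be more accurate).
        xi = int(np.clip(np.rint(pos[0]), 0, W - 1))
        yi = int(np.clip(np.rint(pos[1]), 0, H - 1))
        pos = pos + flows[t, yi, xi]
        track.append(pos.copy())
    return np.array(track)

# Toy flow: every pixel moves one pixel to the right in each of 5 frames.
flows = np.zeros((5, 8, 8, 2))
flows[..., 0] = 1.0
profile = pixel_profile(flows, y=3, x=2)
track = feature_trajectory(flows, y=3, x=2)
```

The profile always has exactly T entries at a fixed grid location, whereas a trajectory drifts across the frame and may leave it, which is the regularity the slide refers to.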
Statistics of Pixel Profiles
[Figure: |feature trajectory − pixel profile| for a scene without motion discontinuity]
[Figure: |feature trajectory − pixel profile| for a scene with motion discontinuity]
Inpaint Discontinuous Motions
[Figure: input frame → optical flow → motion completion → steady-flow]
(see paper for more details)
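Smoothing Pixel Profiles
Smooth each pixel profile individually by minimizing:
$O(\{P(t)\}) = \sum_t \Big( \|P(t) - C(t)\|^2 + \lambda \sum_{r \in \Omega_t} \omega_{t,r} \cdot \|P(t) - P(r)\|^2 \Big)$
• $\|P(t) - C(t)\|^2$: close to original path
• $\Omega_t$: temporal range
• $\omega_{t,r}$: bilateral weight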
Video Results
Computational Videography
Stabilization
Video defog and stereo
TrackCam
Enhance Video Quality
Enable Advanced Photography
Auto Fence Removal
What Are Tracking Shots?
How to Take Tracking Shots?
Our Solution for Tracking Shots
3D Method
[Pipeline: Input video → Object segmentation, 3D scene reconstruction ⨁ 3D foreground motion ⨁ Blur kernels → Result]
• Object segmentation: Adobe After Effects RotoBrush
[Figure: virtual cameras and the foreground motion trajectory]
Main challenge: recover 3D trajectory of the moving foreground
trajectory triangulation
Challenge: a static point vs. a moving point
[Figure: triangulating a static 3D point vs. a dynamic 3D point]
Input:
– Camera pose (computed according to the static background)
– A 2D position of foreground object at each frame
Output:
– A 3D position of the foreground object
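algebraic error constraint
The 3D point $X_t$ is projected to $x_t$ by the camera $P_t$, so the cross product of $x_t$ and $P_t X_t$ vanishes:
$x_t \times (P_t X_t) = 0$
$E_{algebraic} = \sum_t \| x_t \times (P_t X_t) \|^2$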
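A minimal sketch of the algebraic error, using the standard cross-product form of the projection constraint with one camera, one 2D observation and one 3D position per frame. The toy cameras, the points and all function names are hypothetical.

```python
import numpy as np

def cross_matrix(x):
    """Skew-symmetric matrix [x]_x such that cross_matrix(x) @ y == np.cross(x, y)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def project(P, X):
    """Project a 3D point X with the 3x4 camera P to inhomogeneous 2D coordinates."""
    p = P @ np.append(X, 1.0)
    return p[:2] / p[2]

def algebraic_error(cameras, points_2d, points_3d):
    """Sum over frames of || x_t x (P_t X_t) ||^2."""
    total = 0.0
    for P, x, X in zip(cameras, points_2d, points_3d):
        r = cross_matrix(np.append(x, 1.0)) @ (P @ np.append(X, 1.0))
        total += float(r @ r)
    return total

# Two toy cameras (identity rotation, shifted along x) and a point that moves between frames.
cameras = [np.hstack([np.eye(3), [[0.0], [0.0], [0.0]]]),
           np.hstack([np.eye(3), [[-1.0], [0.0], [0.0]]])]
points_3d = [np.array([0.0, 0.0, 5.0]), np.array([0.5, 0.0, 5.0])]
points_2d = [project(P, X) for P, X in zip(cameras, points_3d)]

exact = algebraic_error(cameras, points_2d, points_3d)
shifted = algebraic_error(cameras, points_2d, [X + np.array([0.3, 0.0, 0.0]) for X in points_3d])
```

With a single observation per frame, any 3D point on the viewing ray through $x_t$ gives zero algebraic error, so this term alone leaves the depth undetermined; the remaining constraints are what resolve it.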
linear motion constraint
All 3D points roughly form a line $L$:
$L^* X_t = 0$
$E_{linear} = \sum_t \| L^* X_t \|^2$
• $L$ is a 6D vector, the Plücker line representation
• $L^*$ is a 4 × 4 matrix, obtained by rearranging the elements of $L$ [Avidan & Shashua 2000]
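A minimal sketch of the point-on-line test with Plücker matrices, following the standard construction in projective geometry: the Plücker matrix of the line through two homogeneous points is rearranged into its dual, and the dual times a point vanishes exactly when the point lies on the line. The function names and the toy points are hypothetical.

```python
import numpy as np

def plucker_matrix(a, b):
    """4x4 skew-symmetric Pluecker matrix of the line through homogeneous points a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.outer(a, b) - np.outer(b, a)

def dual_plucker_matrix(L):
    """Dual matrix L*, obtained by rearranging the six independent entries of L."""
    D = np.zeros((4, 4))
    D[0, 1] = L[2, 3]  # l*12 = l34
    D[0, 2] = L[3, 1]  # l*13 = l42
    D[0, 3] = L[1, 2]  # l*14 = l23
    D[1, 2] = L[0, 3]  # l*23 = l14
    D[3, 1] = L[0, 2]  # l*42 = l13
    D[2, 3] = L[0, 1]  # l*34 = l12
    return D - D.T     # fill in the skew-symmetric half

def linear_motion_error(L_dual, points):
    """Sum of || L* X ||^2 over homogeneous 3D points."""
    return float(sum(np.sum((L_dual @ X) ** 2) for X in points))

A = np.array([1.0, 2.0, 3.0, 1.0])
B = np.array([4.0, 6.0, 5.0, 1.0])
L_dual = dual_plucker_matrix(plucker_matrix(A, B))
on_line = [A, B, 0.5 * (A + B)]
off_line = [np.array([0.0, 0.0, 0.0, 1.0])]
```

Because the residual is linear in each point, this term stays quadratic in the 3D positions when the line is held fixed, which fits an alternating estimation of positions and line.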
constant velocity/acceleration constraint
The foreground object has near constant velocity/acceleration:
$E_{velocity} = \sum_t \| X_{t-1} - 2 X_t + X_{t+1} \|^2$
$E_{acceleration} = \sum_t \| X_{t+2} - 3 X_{t+1} + 3 X_t - X_{t-1} \|^2$
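A minimal sketch of these two penalties as finite differences over a (T, 3) array of 3D positions: the second difference vanishes for constant velocity and the third difference vanishes for constant acceleration. The function names and toy trajectories are hypothetical.

```python
import numpy as np

def velocity_error(X):
    """Sum of squared second differences X[t-1] - 2 X[t] + X[t+1]."""
    d2 = X[:-2] - 2 * X[1:-1] + X[2:]
    return float(np.sum(d2 ** 2))

def acceleration_error(X):
    """Sum of squared third differences X[t+2] - 3 X[t+1] + 3 X[t] - X[t-1]."""
    d3 = X[3:] - 3 * X[2:-1] + 3 * X[1:-2] - X[:-3]
    return float(np.sum(d3 ** 2))

t = np.arange(10, dtype=float)
# Constant velocity: position is linear in time.
uniform = t[:, None] * np.array([1.0, 0.5, 0.0])
# Constant acceleration along x: position is quadratic in time.
accelerating = 0.5 * (t ** 2)[:, None] * np.array([1.0, 0.0, 0.0])
```

Here `velocity_error(accelerating)` is nonzero while `acceleration_error(accelerating)` is zero, which shows why the acceleration term is the more permissive of the two.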
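perspective constraint
The foreground’s apparent size is proportional to its inverse depth:
$d_t : d_{t+1} = 1/S_t : 1/S_{t+1}$
$E_{perspective}$ sums the squared violation of this ratio over all frames.
• $d_t$ is $X_t$’s depth, computed from $X_t$ and the 3rd row of $P_t$ [Hartley & Zisserman 2003]
• $S_t$ is the foreground’s pixel count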
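final formulation
Energy minimization over the 3D positions $X$ and the line $L$, combining the algebraic error, linear motion, constant velocity/acceleration, and perspective terms
• Iteratively estimate $X$ and $L$
• Applied to overlapping sub-sequences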
3D results
trajectory evaluation
• without and with perspective constraint
• without and with constant velocity
• without and with linear motion
Pseudo 3D Method
• 3D scene reconstruction → hallucinate background 3D
• 3D foreground motion → hallucinate foreground 3D
hallucinate background 3D
• Principles: faraway points have smaller disparity
• Algorithm:
– 1. remove camera rotation by stabilization
– 2. turn feature disparity to depth directly
hallucinate foreground 3D
• Principles: faraway object appears smaller
• Algorithm: turn object size to depth directly, depth = γ/S for an object of size S
merge foreground and background
Hallucinated foreground & background are in different scales.
We fix the background scale and adjust the foreground scale γ interactively.
pseudo 3D examples
Evaluation
• synthetic examples by Maya
• existing commercial tools
• a manual tool by user study
synthetic examples
[Figure: a tracking camera and a hand-held camera, rendered in Maya]
[Figure, per example: hand-held cam input, tracking cam ground-truth, 3D method, pseudo 3D method]
Example    3D method (PSNR)    Pseudo 3D method (PSNR)
1          33.36               32.28
2          34.12               31.29
3          31.55               29.48
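For reference, PSNR between a result and the ground-truth rendering can be computed as sketched below. This assumes 8-bit images with a peak value of 255 and a single frame; how the slides aggregate PSNR over frames or color channels is not stated, and the function name `psnr` is hypothetical.

```python
import numpy as np

def psnr(reference, result, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    reference = np.asarray(reference, dtype=float)
    result = np.asarray(result, dtype=float)
    mse = np.mean((reference - result) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy check: every pixel is off by exactly one gray level, so the MSE is 1.
ground_truth = np.zeros((4, 4))
estimate = np.ones((4, 4))
value = psnr(ground_truth, estimate)
```

Higher values mean the synthesized tracking shot is closer to the ground truth, which is the direction in which the 3D method leads in all three examples.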
photoshop blur gallery
the Analog Efex 2
our manual tool (3x fast)
User study
• 3 subjects, each creates 20 tracking shots
– A and B use our manual tool
– C uses our automatic tool (10 by 3D, 10 by pseudo 3D)
• 30 viewers judge the quality
[Figure: shots created by A, created by B, created by C]
                    Subject A    Subject B
3D method           61.8%        90.6%
Pseudo 3D method    67.7%        91.2%
The numbers are the percentages of viewers who favored our results
More Results
Summary
Stabilization
Video defog and stereo
TrackCam
Enhance Video Quality
Enable Advanced Photography
Auto Fence Removal