TWO-STREAM CONVOLUTIONAL NETWORKS FOR DYNAMIC TEXTURE SYNTHESIS
Matthew Tesfaldet, Marcus A. Brubaker - York University; Konstantinos G. Derpanis - Ryerson University
{mtesfald, mab}@eecs.yorku.ca, [email protected]

[1] Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. Texture synthesis using convolutional neural networks. NIPS 2015.
[2] Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, and Victor S. Lempitsky. Texture networks: Feed-forward synthesis of textures and stylized images. ICML 2016.
[3] Konstantinos G. Derpanis and Richard P. Wildes. Spacetime texture representation and recognition based on a spatiotemporal orientation analysis. PAMI 2012.
[4] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014.

[Figure: Dynamic Texture Synthesis and Dynamics Style Transfer results. Panels: fireplace_1 (original) vs. fireplace_1 (synthesized); fish (original), synthesized with the appearance stream only vs. with both streams; and a style-transfer example combining an appearance target with a dynamics target to produce the synthesized output.]

Dynamics Stream: network trained for optical flow prediction in an appearance-invariant manner. Models the dynamics of the input dynamic texture.

[Figure: Dynamics Stream architecture. Per the diagram labels: the input (N×2×H×W×1 frame pairs) is encoded at multiple scales (downsampled ×½ twice); each encode applies contrast norm → conv (2×11×11, 32 filters) → rectify (max) → pool (5×5, stride 1) → conv (1×1, 64 filters) → L1 norm. The per-scale features are resampled (×2, ×4), channel-concatenated, and decoded to flow via conv (3×3, 64 filters), conv (1×1, 2 filters), and a ReLU; training minimizes the average endpoint error (aEPE) against the target flow. The appearance stream is VGG-19 [4].]
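As a concrete illustration, here is a minimal PyTorch sketch of one scale of the dynamics-stream encoder, reconstructed from the diagram labels above (2×11×11 conv with 32 filters, max rectification, 5×5 pool with stride 1, 1×1 conv with 64 filters, L1 normalization). The contrast-normalization step, the exact form of the rectification, and all hyperparameters not shown in the figure are assumptions; this is not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicsEncoder(nn.Module):
    """One scale of the dynamics-stream encoder (sketch)."""
    def __init__(self):
        super().__init__()
        # Spatiotemporal conv over a frame pair: 2 (time) x 11 x 11 (space), 32 filters.
        self.conv1 = nn.Conv3d(1, 32, kernel_size=(2, 11, 11), padding=(0, 5, 5))
        # 1x1 conv mixing the 32 channels into 64 features.
        self.conv2 = nn.Conv3d(32, 64, kernel_size=1)

    def forward(self, x):
        # x: (N, 1, 2, H, W) -- N grayscale frame pairs (the "Nx2xHxWx1" input,
        # reordered to PyTorch's channels-first layout).
        x = self.conv1(x)                        # -> (N, 32, 1, H, W)
        x = F.relu(x)                            # "rectify max": assumed max(0, .)
        x = F.max_pool3d(x, kernel_size=(1, 5, 5), stride=1,
                         padding=(0, 2, 2))      # 5x5 spatial pool, stride 1
        x = self.conv2(x)                        # -> (N, 64, 1, H, W)
        # L1-normalize across channels, discarding per-pixel contrast so the
        # representation is (approximately) appearance-invariant.
        return x / (x.abs().sum(dim=1, keepdim=True) + 1e-8)
```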

Overall Architecture

Iteratively coerce an initial Gaussian noise sequence such that its spatiotemporal statistics from each stream match those of an input dynamic texture. This is done by optimizing (3) w.r.t. the spacetime volume (initially noise).
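A minimal sketch of this optimization loop, assuming hypothetical helpers appearance_loss and dynamics_loss that implement Eqs. (1) and (2) below (see the sketch after the equations), weighted as in Eq. (3). The optimizer choice and all hyperparameters here are assumptions, not the authors' settings.

```python
import torch

def synthesize(appearance_loss, dynamics_loss, T, H, W,
               alpha=1.0, beta=1.0, max_iter=500):
    # Spacetime volume of T RGB frames, initialized to i.i.d. Gaussian noise.
    x = torch.randn(T, 3, H, W, requires_grad=True)
    opt = torch.optim.LBFGS([x], max_iter=max_iter)

    def closure():
        opt.zero_grad()
        # Eq. (3): weighted sum of the per-stream losses.
        loss = alpha * appearance_loss(x) + beta * dynamics_loss(x)
        loss.backward()
        return loss

    opt.step(closure)  # iteratively coerce the noise toward the target statistics
    return x.detach()
```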

(1) $\mathcal{L}_{\text{appearance}} = \sum_{l=1}^{L_{\text{app}}} \frac{1}{T} \sum_{t=1}^{T} \left\lVert G_t^{l} - \bar{G}^{l} \right\rVert_F^2$, where $L_{\text{app}}$ is the number of ConvNet layers being used in the appearance stream, $T$ is the number of generated frames, $\lVert \cdot \rVert_F$ is the Frobenius norm, $G_t^{l}$ is the Gram matrix that models the synthesized texture appearance, and $\bar{G}^{l}$ models the target texture appearance averaged across time.

(2) $\mathcal{L}_{\text{dynamics}} = \sum_{l=1}^{L_{\text{dyn}}} \frac{1}{T} \sum_{t=1}^{T} \left\lVert D_t^{l} - \bar{D}^{l} \right\rVert_F^2$, where $L_{\text{dyn}}$ is the number of ConvNet layers being used in the dynamics stream, $D_t^{l}$ is the Gram matrix that models the synthesized texture dynamics, and $\bar{D}^{l}$ models the target texture dynamics averaged across time.

(3) $\mathcal{L}_{\text{DT}} = \alpha \, \mathcal{L}_{\text{appearance}} + \beta \, \mathcal{L}_{\text{dynamics}}$, the overall objective, with weights $\alpha$ and $\beta$ balancing the two streams.
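As an illustration of Eq. (1), a minimal sketch of the Gram statistics and the appearance loss. Here features, layers, and target_grams are hypothetical names (the activations would come from VGG-19 [4] in the appearance stream), and the normalization convention is an assumption.

```python
import torch

def gram(F):
    # F: (C, H*W) activations at one layer; the Gram matrix F F^T captures
    # pairwise channel correlations, i.e., the appearance statistics.
    return (F @ F.t()) / F.shape[1]

def appearance_loss(x, features, layers, target_grams):
    # x: (T, 3, H, W) synthesized frames. target_grams[l] is the target
    # texture's Gram at layer l, already averaged across its frames.
    T = x.shape[0]
    loss = x.new_zeros(())
    for l in layers:
        for t in range(T):
            F_t = features(x[t:t + 1], l).squeeze(0).flatten(1)  # (C, H*W)
            # Squared Frobenius norm of the Gram difference, averaged over frames.
            loss = loss + (gram(F_t) - target_grams[l]).pow(2).sum() / T
    return loss
```

In the optimization loop sketched earlier, this would be wrapped (e.g., with functools.partial) so that it takes only the spacetime volume x; the dynamics loss of Eq. (2) has the same form, with the dynamics-stream activations in place of the VGG-19 features.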

1. Motivated by the recent successes in texture synthesis using ConvNets [1, 2], we present a novel, two-stream model of dynamic texture synthesis to capture both appearance and dynamics.

2. A novel network architecture (motivated by the spacetime oriented energy model of [3]) designed to compute optical flow in an appearance-invariant manner, serving as the dynamics stream of our dynamic texture synthesis model.

3. A two-stream model that enables dynamics style transfer, where the appearance and dynamics from different sources can be combined to generate a novel texture (see the usage sketch below).
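Dynamics style transfer falls out of the same optimization: reusing the hypothetical synthesize helper sketched earlier, one would simply precompute the appearance statistics from one texture and the dynamics statistics from another. All names below are assumptions for illustration.

```python
# Hypothetical usage: appearance from one texture, dynamics from another.
# build_appearance_loss / build_dynamics_loss would precompute the target
# Gram matrices from each source texture (names are assumptions).
app_loss = build_appearance_loss(fish_frames)
dyn_loss = build_dynamics_loss(fireplace_frames)
result = synthesize(app_loss, dyn_loss, T=12, H=256, W=256)
```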

We introduce a two-stream model for dynamic texture synthesis based on pre-trained convolutional networks (ConvNets) that target two independent tasks: object recognition and optical flow prediction. Given an input dynamic texture, the object recognition ConvNet models the per-frame appearance of the input texture, while the optical flow ConvNet models its dynamics. To generate a novel texture, a noise sequence is optimized to match the feature statistics from each stream of the input texture.

