Object Recognition in the Dynamic Link Architecture

Yang Ran

CMPS 828J

23/4/19 2

Outline

Background and Introduction
System Overview
General algorithm in detail
Implementations of the algorithm
Experimental results
Further readings and conclusion

Background

1. Problem: To recognize human faces from single images out of a large gallery.

2. Challenges: Distortions in terms of position, size, expression, and pose

3. Existing methods: appearance-based vs. shape-based; 2D vs. 3D

Background: Notation

1. Image: face image

2. Model: face gallery

3. Graph: a concise face description

4. Jet: a local description of the gray-value distribution based on the Gabor transform

System Overview

1. Faces are represented as rectangular graphs by layers of neurons

2. Each neuron represents a node and has a jet attached

Assumptions

The image domain and the model domain are bi-directionally connected by dynamic links.
These connections are plastic on a fast time scale, changing radically during a single recognition event.
The strength of a connection between any two nodes in the image and a model is controlled by the jet similarity between them, which roughly corresponds to the number of features that are common to the two nodes.

Key Factors

The basic representation is a labeled graph formed by edges and vertices
  Edge labels: distance information
  Vertex/node labels: wavelet responses, bundled in jets

The graph should be able to deform to adapt to the variations of human faces

Preprocessing by Gabor Wavelets

Gabor wavelets are biologically motivated convolution kernels in the shape of plane waves restricted by a Gaussian envelope function
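
A minimal sketch of such a kernel, assuming the commonly used form of a complex plane wave with wave vector (kx, ky) under a Gaussian envelope of width sigma, with a constant subtracted so the kernel is approximately free of a DC response. The function name `gabor_kernel`, the default sigma = 2π, and the DC-correction term are assumptions for illustration, not taken from the slides.

```python
import cmath
import math

def gabor_kernel(kx, ky, x, y, sigma=2 * math.pi):
    """Value at offset (x, y) of a Gabor kernel with wave vector (kx, ky).

    Assumed form: Gaussian envelope times (plane wave minus a constant).
    """
    k2 = kx * kx + ky * ky          # squared length of the wave vector
    r2 = x * x + y * y              # squared distance from the kernel center
    envelope = (k2 / sigma**2) * math.exp(-k2 * r2 / (2 * sigma**2))
    plane_wave = cmath.exp(1j * (kx * x + ky * y))
    dc_term = math.exp(-sigma**2 / 2)   # makes the kernel roughly zero-mean
    return envelope * (plane_wave - dc_term)
```

Convolving an image with this kernel at one pixel gives one complex coefficient; the envelope limits the response to a neighborhood whose size scales with 1/|k|.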

More on Gabor Wavelets

Why use them?
  A good approximation to the sensitivity profiles of neurons found in the visual cortex of higher vertebrates
  Cells come in pairs with even and odd symmetry, like the real and imaginary parts of a Gabor filter
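
To make the even/odd pairing concrete, assume a Gabor kernel of the form below, where the wave vector k, its length k, and the envelope width σ are assumed symbols not given in the slides. The real part is a cosine wave under the Gaussian envelope and is even in x; the imaginary part is a sine wave under the same envelope and is odd in x.

```latex
\begin{align}
\psi_{\mathbf{k}}(\mathbf{x}) &= G_{\mathbf{k}}(\mathbf{x})
  \left[ e^{\,i\,\mathbf{k}\cdot\mathbf{x}} - e^{-\sigma^2/2} \right],
  \qquad
  G_{\mathbf{k}}(\mathbf{x}) = \frac{k^2}{\sigma^2}
  \exp\!\left(-\frac{k^2 \lVert \mathbf{x} \rVert^2}{2\sigma^2}\right) \\
\operatorname{Re}\,\psi_{\mathbf{k}}(\mathbf{x}) &= G_{\mathbf{k}}(\mathbf{x})
  \left[ \cos(\mathbf{k}\cdot\mathbf{x}) - e^{-\sigma^2/2} \right],
  \qquad
  \operatorname{Re}\,\psi_{\mathbf{k}}(-\mathbf{x}) = \operatorname{Re}\,\psi_{\mathbf{k}}(\mathbf{x}) \\
\operatorname{Im}\,\psi_{\mathbf{k}}(\mathbf{x}) &= G_{\mathbf{k}}(\mathbf{x})
  \sin(\mathbf{k}\cdot\mathbf{x}),
  \qquad
  \operatorname{Im}\,\psi_{\mathbf{k}}(-\mathbf{x}) = -\operatorname{Im}\,\psi_{\mathbf{k}}(\mathbf{x})
\end{align}
```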

Jet Generation

1. The set of convolution coefficients for kernels of different orientations and frequencies at one image pixel is called a jet

2. A jet describes a small patch of gray values around a given pixel

3. Sample W at five logarithmically spaced frequency levels and eight directions, indexed by u and v
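
The sampling in item 3 can be sketched as follows. The five levels and eight directions come from the slide; the highest frequency `k_max = pi/2`, the spacing factor `sqrt(2)` between levels, and the function name `wave_vectors` are assumed values chosen for illustration.

```python
import math

def wave_vectors(n_freqs=5, n_dirs=8, k_max=math.pi / 2, spacing=math.sqrt(2)):
    """Return one (kx, ky) wave vector per frequency level and direction."""
    vectors = []
    for level in range(n_freqs):
        # Logarithmic spacing: each level divides the magnitude by `spacing`.
        k = k_max / spacing**level
        for d in range(n_dirs):
            # Directions are spread evenly over half a circle.
            phi = d * math.pi / n_dirs
            vectors.append((k * math.cos(phi), k * math.sin(phi)))
    return vectors
```

With the defaults this yields 40 wave vectors, so each jet would hold 40 coefficients.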

Jet Generation (cont'd)

The magnitudes of (WI)(k_uv, x) form a feature vector located at x, which will be referred to as a jet

Jet similarity is evaluated during Elastic Graph Matching
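
A minimal sketch of a jet similarity, assuming the normalized dot product of the magnitude vectors, a measure commonly used to compare Gabor jets. The function name `jet_similarity` and the handling of all-zero jets are assumptions.

```python
import math

def jet_similarity(jet_a, jet_b):
    """Normalized dot product of two jets given as sequences of magnitudes."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm_a = math.sqrt(sum(a * a for a in jet_a))
    norm_b = math.sqrt(sum(b * b for b in jet_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: an empty response matches nothing
    return dot / (norm_a * norm_b)
```

Because of the normalization, scaling all magnitudes of one jet by a constant factor leaves the similarity unchanged, which gives some tolerance to contrast changes.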

Edge Labels

Derived from the neural version, edges encode neighborhood relationships

Edges represent the topology of the vertices
Each edge carries a label defined from the distance information between the two vertices it connects

Edge labels are compared with a quadratic comparison function
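
A minimal sketch of edge labels and their comparison, assuming the label of an edge is the difference vector between its two vertex positions and the quadratic comparison is the squared length of the difference between image and model labels. The names `edge_label` and `edge_cost` are hypothetical.

```python
def edge_label(pos_i, pos_j):
    """Difference vector from vertex i to vertex j (assumed edge label)."""
    return (pos_j[0] - pos_i[0], pos_j[1] - pos_i[1])

def edge_cost(edge_image, edge_model):
    """Squared distance between an image edge label and a model edge label."""
    dx = edge_image[0] - edge_model[0]
    dy = edge_image[1] - edge_model[1]
    return dx * dx + dy * dy
```

An undistorted edge costs zero, and the cost grows quadratically as the image graph stretches or shears relative to the model graph.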

Example: graph representation of a face

Elastic Graph Matching

Elastic matching of a model graph M to a target graph I amounts to a search for a set of vertex positions which simultaneously optimizes the matching of vertex labels and edge labels according to a combined cost function.
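
One way to write such an objective, under stated assumptions: S_v is a vertex (jet) similarity, S_e is a quadratic edge comparison, Δ_ij is the edge label between vertices i and j, and λ weights rigidity against feature match. These symbols and the sign convention (lower cost is better) are assumptions for illustration.

```latex
\begin{align}
C_{\text{total}}\bigl(\{x_i^I\}\bigr)
  &= \lambda \sum_{(i,j) \in E} S_e\!\left(\Delta_{ij}^I, \Delta_{ij}^M\right)
   \;-\; \sum_{i \in V} S_v\!\left(J^I(x_i^I),\, J_i^M\right) \\
S_e\!\left(\Delta_{ij}^I, \Delta_{ij}^M\right)
  &= \left\lVert \Delta_{ij}^I - \Delta_{ij}^M \right\rVert^2,
  \qquad \Delta_{ij} = x_j - x_i \\
S_v\!\left(J^I, J^M\right)
  &= \frac{J^I \cdot J^M}{\lVert J^I \rVert \, \lVert J^M \rVert}
\end{align}
```

A large λ penalizes any deformation of the graph; a small λ lets vertices move freely toward the best-matching jets.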

Elastic Graph Matching (cont'd)

A heuristic algorithm is used to get close to the optimum within a reasonable time
Step 1: Find the approximate face position so that the image can be scaled and cut to a standard size
Step 2: Extract a graph from the target face image
Step 3: Match with the cost function

  Refine position and size with λ = infinity
  Local distortion
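
The two phases of Step 3 can be sketched as below. This is an illustration under assumptions, not the original implementation: `jet_at` is a hypothetical callback returning the image jet at a position, the rigid phase only scans translations (size refinement is omitted), and the distortion phase uses a simple greedy search over one-pixel moves as a stand-in for whatever search procedure the original system uses.

```python
import math

def similarity(a, b):
    """Normalized dot product of two jets (assumed vertex similarity)."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0

def total_cost(positions, model_pos, model_jets, edges, jet_at, lam):
    """lam * (squared edge distortions) minus (sum of vertex similarities)."""
    edge_term = 0.0
    for i, j in edges:
        dx = (positions[j][0] - positions[i][0]) - (model_pos[j][0] - model_pos[i][0])
        dy = (positions[j][1] - positions[i][1]) - (model_pos[j][1] - model_pos[i][1])
        edge_term += dx * dx + dy * dy
    vertex_term = sum(similarity(jet_at(x, y), jet)
                      for (x, y), jet in zip(positions, model_jets))
    return lam * edge_term - vertex_term

def match(model_pos, model_jets, edges, jet_at, offsets, lam, sweeps=3):
    """Rigid placement followed by greedy local distortion."""
    def shifted(off):
        return [(x + off[0], y + off[1]) for x, y in model_pos]

    # Phase 1: rigid graph (the lam = infinity case). A pure shift leaves
    # every edge unchanged, so only the vertex term decides; lam is set to 0.
    best = min(offsets, key=lambda off: total_cost(
        shifted(off), model_pos, model_jets, edges, jet_at, 0.0))
    positions = shifted(best)

    # Phase 2: local distortion with finite lam, one vertex at a time.
    cost = total_cost(positions, model_pos, model_jets, edges, jet_at, lam)
    for _ in range(sweeps):
        for v in range(len(positions)):
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                trial = list(positions)
                trial[v] = (positions[v][0] + dx, positions[v][1] + dy)
                c = total_cost(trial, model_pos, model_jets, edges, jet_at, lam)
                if c < cost:  # keep a move only if it lowers the total cost
                    positions, cost = trial, c
    return positions, cost
```

Running `match` once per gallery model and picking the model with the lowest final cost gives a simple recognition decision under these assumptions.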

Experiments

Data Base
Technical Aspects
Results
Conclusions

Data Base

As a face data base we used galleries of 111 different persons. For most persons there is one neutral frontal view, one frontal view with a different facial expression, and two views rotated in depth by 15 and 30 degrees respectively.

Technical Aspects

The CPU time needed for the recognition of one face against a gallery of 111 models is approximately 10-15 minutes on a Sun SPARCstation 10-512 with a 50 MHz processor.

Results: Office Items

Comparison of Two Galleries

More Results

More Results (cont'd)

Recognition Results Against Galleries

Recognition results against a gallery of 20, 50, and 111 neutral frontal views

Conclusion

Close to a natural model: only a small number of examples is needed for face recognition

Gabor wavelet representations are robust to moderate lighting changes, shifts, and deformations

Elastic Graph Matching in the Dynamic Link Architecture is robust in face recognition

Conclusion (cont'd)

1. Having only a few images per person in the gallery does not provide sufficient information to handle 3D rotation

2. Rectangular grid vs. feature points

References

1. M. Lades, J.C. Vorbrüggen, J. Buhmann, J. Lange, C. von der Malsburg, R.P. Würtz, W. Konen. Distortion Invariant Object Recognition in the Dynamic Link Architecture. IEEE Transactions on Computers, 1993, 42(3):300-311.

2. Laurenz Wiskott, Jean-Marc Fellous, Norbert Krüger, et al. Face Recognition by Elastic Bunch Graph Matching. Proc. 7th Intern. Conf. on Computer Analysis of Images and Patterns, CAIP'97, Kiel.
