© 2009 Robert Hecht-Nielsen. All rights reserved.
Andrew Smith
University of California, San Diego
10.14.09
Building a Visual Hierarchy
Outline
Building A Visual Hierarchy
Learning layer-by-layer
Inference – filling in a missing segment of an image
Examples
Applications/Products & Future work
Choosing an appropriate problem
We want to:
Model human visual processes.
Understand vision in terms of Confabulation Theory.
Build practical applications.
Lay the basis for much deeper research.
Answer:
Build image modeling system.
Represent images in terms of textural components (low statistical order).
Represent images as symbolic (discrete) tuples.
Machine Vision vs. Biological Vision
Machine Vision
Pixels --- local representation.
Orthogonal
Biological Vision
Filter/Feature responses
Massively overcomplete/non-orthogonal
Confabulation & vision (Pixels → Modules & Symbols)
Features (symbols) develop in a layer of the hierarchy as commonly seen patterns of that layer's inputs.
Knowledge links are simple conditional probabilities between symbols:
p(α|β), where α and β are symbols in connected modules
All knowledge can therefore be learned by simple co-occurrence counting.
p(α|β) = C(α,β) / C(β)
Confabulation operations:
Given evidence α, β, γ, δ, find the answer ε that maximizes:
p(α|ε) p(β|ε) p(γ|ε) p(δ|ε)
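The counting rule and the confabulation step above can be sketched in a few lines. This is only an illustrative toy, not the system described in these slides: the symbol names, the single shared link table, and the `floor` value used for unseen pairs are all assumptions made for the example.

```python
from collections import Counter
import math

def learn_links(pairs):
    """Estimate p(a|b) = C(a,b) / C(b) from a list of co-occurring (a, b) symbol pairs."""
    pair_counts = Counter(pairs)
    b_counts = Counter(b for _, b in pairs)
    return {(a, b): c / b_counts[b] for (a, b), c in pair_counts.items()}

def confabulate(evidence, candidates, links, floor=1e-6):
    """Return the candidate e that maximizes the product of p(a|e) over evidence symbols a.

    Logs are summed instead of multiplying probabilities; `floor` (an assumption)
    stands in for pairs that were never seen together.
    """
    def score(e):
        return sum(math.log(links.get((a, e), floor)) for a in evidence)
    return max(candidates, key=score)

# Hypothetical training pairs: (observed symbol, symbol in the connected module).
pairs = [("edge", "wall"), ("edge", "wall"), ("blob", "wall"),
         ("blob", "cloud"), ("blob", "cloud"),
         ("soft", "cloud"), ("soft", "cloud"), ("edge", "cloud")]
links = learn_links(pairs)
answer = confabulate(["blob", "soft"], ["wall", "cloud"], links)
print(answer)
```

In a real multi-module setting each evidence source would presumably have its own link table; one table is used here only to keep the sketch short.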
Building a vision hierarchy
• Can no longer use SSE (sum of squared errors) to evaluate the model
[ SSE maximizes p(ε|α,β,γ), the probability of the answer ε given the evidence α, β, γ ]
• Instead, make use of generative model:
– Always be able to generate a plausible image.
Data set
• 4,300 natural images, 1.5 Mpix each, black and white (BW)
Vision Hierarchy – level “0”
We know the first transformation from neuroscience research: simple cells approximate Gabor filters.
5 scales, 16 orientations (odd + even)
Parameters picked to closely resemble feline simple cells.
Same approach is used elsewhere in lab. [Minnett, et al.]
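A Gabor filter bank of this shape can be sketched as follows. The slides give only the counts (5 scales, 16 orientations, odd + even), so every numeric parameter below (`base_wavelength`, `scale_step`, the envelope width, the kernel size) is a placeholder assumption, not the feline-matched values used in the lab. The sketch also assumes one even/odd pair per orientation; the slides do not say whether the 16 count already includes both phases.

```python
import numpy as np

def gabor_pair(size, wavelength, theta, sigma):
    """Return (even, odd) Gabor kernels: a Gaussian envelope times cosine / sine carriers."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # coordinate along the carrier direction
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    phase = 2.0 * np.pi * xr / wavelength
    even = envelope * np.cos(phase)
    odd = envelope * np.sin(phase)
    even -= even.mean()                             # remove the DC response of the even kernel
    return even, odd

def gabor_bank(n_scales=5, n_orient=16, base_wavelength=4.0, scale_step=1.5):
    """Build (scale, orientation, even, odd) tuples; all numeric defaults are assumptions."""
    bank = []
    for s in range(n_scales):
        wavelength = base_wavelength * scale_step**s
        sigma = 0.5 * wavelength                    # assumed envelope width
        size = 2 * int(np.ceil(3 * sigma)) + 1      # odd size covering about 3 sigma
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            bank.append((s, k, *gabor_pair(size, wavelength, theta, sigma)))
    return bank

bank = gabor_bank()
print(len(bank), bank[0][2].shape, bank[-1][2].shape)
```

Convolving an image with each kernel (for example with an FFT-based convolution) would give the level "0" responses.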
Vision Hierarchy – level “0”
• Does the full convolution preserve the information in images? (inverted by least squares, LS)
• Very closely.
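The idea of inverting an overcomplete filter representation by least squares can be illustrated on a 1-D toy signal. This is a sketch under stated assumptions: the signal is random rather than a natural image, the three wavelengths and envelope widths are invented for the example, and `np.linalg.lstsq` stands in for whatever solver was actually used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                                   # signal length (a 1-D stand-in for an image)
t = np.arange(n)

# Overcomplete analysis matrix: each row is one 1-D Gabor filter (hypothetical parameters).
rows = []
for wavelength in (4.0, 8.0, 16.0):
    sigma = 0.5 * wavelength
    for center in range(n):
        env = np.exp(-((t - center) ** 2) / (2 * sigma**2))
        rows.append(env * np.cos(2 * np.pi * (t - center) / wavelength))  # even filter
        rows.append(env * np.sin(2 * np.pi * (t - center) / wavelength))  # odd filter
A = np.array(rows)                       # shape (384, 64): many more responses than samples

signal = rng.standard_normal(n)
responses = A @ signal                   # "encode" the signal as filter responses
recon, *_ = np.linalg.lstsq(A, responses, rcond=None)   # least-squares inversion
rmse = np.sqrt(np.mean((recon - signal) ** 2))
print(A.shape, rmse)
```

Because the toy responses are noise-free and the filters jointly cover the whole frequency range, the reconstruction error here is limited only by floating-point precision.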
Vision Hierarchy – level “0”
• We can do even better by super-sampling an image before encoding:
Vision Hierarchy – level “0”
• Supersampling RMSE:
1x: 0.0202 2x: 0.0081 3x: 0.0051 4x: 0.0044 5x: 0.0038
Inverting Gabor Representations
Studied by Daugman
Simple cells (found in the 1950s) re-represent “pixel” data; Daugman first characterized them as Gabor logons in the 1980s.
Attempted to answer “How much information is lost?”
“not much!” -- Able to completely reconstruct images.
(i.e., what we've just seen in the previous few slides)
Frame analysis can show:
When complete inversion is possible (with a mathematical proof).
What the optimal linear inverse is.
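As a small illustration of the frame-analysis view, the sketch below computes frame bounds from singular values and inverts with the pseudo-inverse, which is the standard least-squares linear inverse. The three-vector frame in the plane is a textbook example chosen for the sketch, not something taken from these slides.

```python
import numpy as np

def frame_bounds(analysis):
    """Return (lower, upper) frame bounds: the squared extreme singular values.

    Assumes the analysis matrix has at least as many rows (filters) as columns
    (signal dimensions). Complete inversion is possible exactly when lower > 0.
    """
    s = np.linalg.svd(analysis, compute_uv=False)   # sorted in descending order
    return s[-1] ** 2, s[0] ** 2

# Three unit vectors 120 degrees apart in the plane: a redundant, non-orthogonal frame.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
F = np.column_stack([np.cos(angles), np.sin(angles)])   # shape (3, 2)

lower, upper = frame_bounds(F)
dual = np.linalg.pinv(F)            # pseudo-inverse, shape (2, 3)
x = np.array([0.3, -1.2])
x_rec = dual @ (F @ x)              # analyze with F, then reconstruct
print(lower, upper, x_rec)
```

When the two bounds are close, as in this example, the inversion is well conditioned; a lower bound near zero would mean some signal components are almost invisible to the filters.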
Vision Hierarchy – level 1
• We now have a simple-cell-like representation.
• How to create a symbolic representation (“Complex Cells”)?
• Apply principle of Confabulation Theory: Collect common sets of inputs from simple cells: similar to a Vector Quantizer.
• Keep the 5 scales separate
– (quantize 16 dimensions, not 80)
Vision Hierarchy – level 1
• To create actual symbols, we use a vector quantizer
– Trade-offs (threshold of quantizer):
   Number of symbols
   Preservation of information
   Probability accuracy
• Solution: use an angular distance metric (dot product)
– Keep only symbols that occurred in the training set more than 200 times, to get accurate p(·).
– After training, ~95% of samples should be within threshold of at least one symbol.
– Pick a threshold so images can be plausibly generated.
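The quantizer rules above (angular distance, a minimum occurrence count, a coverage check) can be sketched as follows. The slides do not say how the codebook is trained, so the greedy one-pass "leader" clustering here is a swapped-in stand-in, and `cos_threshold=0.9` and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def train_angular_vq(samples, cos_threshold=0.9, min_count=200):
    """Greedy one-pass quantizer on unit vectors (assumes no all-zero samples).

    A sample joins the most similar existing codeword if their dot product is at
    least cos_threshold; otherwise it founds a new codeword. Codewords seen
    min_count times or fewer are discarded afterwards.
    """
    unit = samples / np.linalg.norm(samples, axis=1, keepdims=True)
    codebook, counts = [], []
    for v in unit:
        if codebook:
            sims = np.array(codebook) @ v
            best = int(np.argmax(sims))
            if sims[best] >= cos_threshold:
                counts[best] += 1
                continue
        codebook.append(v)
        counts.append(1)
    keep = [i for i, c in enumerate(counts) if c > min_count]
    return np.array([codebook[i] for i in keep]), unit

def coverage(symbols, unit, cos_threshold=0.9):
    """Fraction of samples within the angular threshold of at least one symbol."""
    if len(symbols) == 0:
        return 0.0
    return float(np.mean((unit @ symbols.T).max(axis=1) >= cos_threshold))

# Synthetic 16-dimensional data: three tight clusters of 500 samples each.
rng = np.random.default_rng(0)
centers = np.eye(16)[:3]
data = np.repeat(centers, 500, axis=0) + 0.02 * rng.standard_normal((1500, 16))
symbols, unit = train_angular_vq(data)
frac = coverage(symbols, unit)
print(symbols.shape, frac)
```

Raising `cos_threshold` yields more symbols and better information preservation, but fewer occurrences per symbol and therefore noisier probability estimates, which is the trade-off listed above.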
Vision Hierarchy – level 1
• Symbolic representation can generate plausible images:
• A theory of animal vision that actually demonstrates that animals can see!
Vision Hierarchy – level 1
• ~8,000 symbols are learned for each of the 5 scales.
• Complex local features develop. (unlike PCA re-representations & ICA representations)
Vision Hierarchy – level 1
• Now image is re-represented as 5 “planes” of symbols:
Knowledge links:
• Learn which symbols may be next to which symbols (conditional probabilities)
• Learn which symbols may be over/under which symbols.
• Go out to ‘radius’ 7.
Consistent with cortical representation of knowledge
Very large knowledge base (tens of GB).
Texture modeling – (inference)
What if a portion of our image symbol representation is damaged?
Blind spot
CCD defect
brain lesion
We can use confabulation (generation) to infer a plausible replacement.
Texture modeling – Inference 1
• Fill in the missing region by confabulating from lateral and different-scale neighbors (radius 5).
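A minimal sketch of this kind of fill-in, restricted to a single symbol plane and a single missing cell: links are counted per spatial offset, and the missing symbol is chosen to maximize the product of the conditional probabilities of its neighbors. The striped toy plane, radius 2, the `floor` for unseen pairs, and the omission of cross-scale neighbors are all simplifying assumptions.

```python
from collections import Counter
import math
import numpy as np

def offsets(radius):
    """All (dy, dx) neighbor offsets within a square radius, excluding the center."""
    return [(dy, dx) for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1) if (dy, dx) != (0, 0)]

def learn_links(plane, radius):
    """Count, per offset, how often neighbor symbol n appears next to center symbol c."""
    h, w = plane.shape
    pair, center = Counter(), Counter()
    for y in range(h):
        for x in range(w):
            c = int(plane[y, x])
            for dy, dx in offsets(radius):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    pair[(dy, dx), int(plane[ny, nx]), c] += 1
                    center[(dy, dx), c] += 1
    return pair, center

def fill_in(plane, y, x, radius, pair, center, candidates, floor=1e-6):
    """Pick the candidate c maximizing the product of p(neighbor | c) over all neighbors."""
    h, w = plane.shape
    def score(c):
        total = 0.0
        for dy, dx in offsets(radius):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                denom = center[(dy, dx), c]
                p = pair[(dy, dx), int(plane[ny, nx]), c] / denom if denom else 0.0
                total += math.log(max(p, floor))   # floor (an assumption) for unseen pairs
        return total
    return max(candidates, key=score)

# Toy "symbol plane": vertical stripes of symbols 0, 1, 2.
plane = np.tile(np.arange(12) % 3, (12, 1))
pair, center = learn_links(plane, radius=2)
damaged = plane.copy()
damaged[5, 7] = -1                                 # mark one cell as missing
result = fill_in(damaged, 5, 7, 2, pair, center, candidates=[0, 1, 2])
print(result)
```

Filling a larger region would need some ordering or iteration over the missing cells, which this sketch does not attempt.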
Texture modeling
Texture modeling
More Examples 1/7 (find the replacements)
More Examples 1/7 (replacement locations)
More Examples 2/7 (find the replacements)
More Examples 2/7 (replacement locations)
More Examples 3/7 (find the replacements)
More Examples 3/7 (replacement locations)
More Examples 4/7 (find the replacements)
More Examples 4/7 (replacement locations)
More Examples 5/7 (find the replacements)
More Examples 5/7 (replacement locations)
More Examples 6/7 (find the replacements)
More Examples 6/7 (replacement locations)
Texture modeling
Conclusions
This visual hierarchy does an excellent job of capturing an image up to a certain order of complexity.
Given this visual hierarchy and its learned knowledge links, missing regions could plausibly be filled in. This could be a reasonable explanation for what animals do.
Preparing for publication (IEEE Transactions on Image Processing), with help from Professor Serge Belongie (CSE).
Last hurdle to graduation!
The next level…
Level 2 symbol hierarchy
• Collect commonly recurring regions of level 1 symbols.
• Symbols at Level 2 will fit together like puzzle pieces.
Thank you!