G52IIP, School of Computer Science, University of Nottingham
Storing and Retrieving Images
Content-based Image/Video Indexing and Retrieval
Selected Advanced Topics
Problem
Find all images that contain horses …
Text-based technology
Annotation: each image is indexed with a set of relevant text phrases
Retrieval: based on text search technology
Appropriate phrases to describe the content of this image include:
Mother, Child, Vegetable, Yellow, Green, Purple …
Text-based technology - Drawbacks
Annotation - subjective: different people may use different phrases to describe the same or very similar image/content
Text-based technology - Drawbacks
Annotation - Laborious
It will take a lot of man-hours to label large image/video databases with 1m+ items
Content-based Technology
Using Visual Examples
Content-based Technology
Using Visual Features (e.g., the proportions r%, g%, b% of red, green and blue in an image)
Content-based Technology
Content-based image indexing and retrieval (CBIR) is an image database management technique that indexes the data items (images or video clips) using visual features (e.g., colour, shape, and texture) of the images or video clips.
A CBIR system lets users find pictorial information in large image and video databases based on visual cues, such as colour, shape, texture, and sketches.
Content-based Technology
The visual features, computed using image processing and computer vision techniques, are used to represent the image contents numerically.
Image Content - a high level concept, e.g., this image is a sunset scene, a landscape scene, etc.
Numerical Content Representations - low-level numbers; often the same set of numbers can come from very different images, making the task very hard!
Content-based Technology
Techniques for Computing Visual Features/Representing Image Contents –
some are very sophisticated, and many are still not mature;
hence the computational processes are automatic in some cases and semi-automatic in others; in the most difficult cases, the work may have to be done manually
Content-based Technology
Comparing Image Content/Retrieving Images based on Content
Simple approaches - compute the metric distance between low level numerical representations
Advanced Approaches - using sophisticated pattern recognition, artificial intelligence, neural networks, and interactive (relevance feedback) techniques to compare the visual content (low-level numerical features)
Content-based Technology - IBM QBIC System
IBM's QBIC (Query by Image and Video Content) system is one of the early examples of a CBIR system, developed in the 1990s.
The system lets users find pictorial information in large image and video databases based on colour, shape, texture, and sketches.
Content-based Technology - IBM QBIC System
The User Interface Module
- Lets the user specify a visual query by drawing, selecting from a colour wheel, selecting a sample image, …
- Displays results as an ordered set of images
The Database Population and Database Query Modules
- Database population: process images and video to extract features describing their content (colours, textures, shapes, and camera and object motion) and store the features in a database
- Database query: let the user compose a query graphically, extract features from the graphical query, and input them to a matching engine that finds images or video clips with similar features
Content-based Technology - IBM QBIC System
The Data Model
- Still image, or scene: the full image
- Objects contained in the full image: subsets of an image
- Videos: broken into clips called shots, sets of contiguous frames
- Representative frames, the r-frames, are generated for each shot
- R-frames are treated as still images, from which features are extracted and stored in the database
- Further processing of shots generates motion objects, e.g., a car moving across the screen
Content-based Technology - IBM QBIC System
Queries are allowed on
- Objects - e.g., find images with a red round object
- Scenes - e.g., find images that have approximately 30% red and 15% blue colours
- Shots - e.g., find all shots panning from left to right
- A combination of the above - e.g., find images that have 30% red and contain a blue textured object
Content-based Technology - IBM QBIC System
Similarity Measures
Similarity queries are done against the database of pre-computed features using distance functions between the features
Examples include Euclidean distance, city-block distance, …
These distance functions are intended to mimic human perception to approximate a perceptual ordering of the database
But, it is often the case that a distance metric in a feature space will bear little relevance to perceptual similarity.
Content-based Technology - Basic Architecture
[Diagram: basic CBIR architecture. A query's features (colour, texture, shape, positions, …) are compared, via the similarity measures, against the per-record feature vectors (colour, texture, shape, positions, … for Record 1 … Record n) stored as metadata alongside the imagery in the image database.]
Colour - An effective Visual Cue
Colours can be a more powerful visual cue than you might initially think!
What soft drink?
Which fruit?
Colour - An effective Visual Cue
In many cases, color can be very effective.
Here is an example
Results of content-based image retrieval using 4096-bin color histograms
Colour Spaces - Colour Models
RGB Model: This colour model uses the three NTSC primary colours to describe a colour within a colour image.
[Diagram: the RGB colour cube, with R, G and B axes and the yellow, cyan, magenta, white and grey points marked.]
Sometimes in computer vision it is convenient to use the rg chromaticity space:
r = R/(R+G+B)
g = G/(R+G+B)
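The two formulas above can be sketched in a few lines (Python, chosen here for illustration; the function name is ours). The key property is that chromaticity discards overall intensity:

```python
def rg_chromaticity(r, g, b):
    """Map an RGB triple to rg chromaticity space (intensity-normalised).

    Returns (r, g); the b coordinate is redundant since r + g + b = 1.
    """
    total = r + g + b
    if total == 0:  # avoid division by zero for a pure black pixel
        return (0.0, 0.0)
    return (r / total, g / total)

# A bright red and a darker red map to (approximately) the same point:
print(rg_chromaticity(200, 40, 40))  # ~ (0.714, 0.143)
print(rg_chromaticity(100, 20, 20))  # same chromaticity
```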
Colour Spaces
YIQ Model: The YIQ model is used in commercial colour TV broadcasting; it is a re-coding of RGB for transmission efficiency and for maintaining compatibility with the monochrome TV standard.
Y = 0.299R + 0.587G + 0.114B
I = 0.596R - 0.275G - 0.321B
Q = 0.212R - 0.523G + 0.331B
In YIQ, the luminance (Y) and colour information (I and Q) are de-coupled.
YCbCr Model
Y  =  0.299R + 0.587G + 0.114B
Cb = -0.169R - 0.331G + 0.500B
Cr =  0.500R - 0.419G - 0.081B
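These equations translate directly into code (Python, used here for illustration; the function name is ours). A grey pixel shows the de-coupling: when R = G = B, the chroma components Cb and Cr vanish and only the luminance Y remains:

```python
def rgb_to_ycbcr(r, g, b):
    """RGB -> YCbCr using the coefficients given above."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b
    cr =  0.500 * r - 0.419 * g - 0.081 * b
    return (y, cb, cr)

# A mid-grey pixel: luminance 128, zero chroma
print(rgb_to_ycbcr(128, 128, 128))  # (128.0, 0.0, 0.0)
```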
Perceived Color Differences
One problem with the RGB colour system is that colorimetric distances between the individual colours don't correspond to perceived colour differences.
For example, in the chromaticity diagram, the difference between green and greenish-yellow is relatively large, whereas the distance distinguishing blue and red is quite small.
CIELAB
The CIE (Commission Internationale de l'Eclairage) solved this problem in 1976 with the development of the Lab colour space. A three-dimensional colour space was the result. In this model, the colour differences which you perceive correspond to distances when measured colorimetrically. The a axis extends from green (-a) to red (+a) and the b axis from blue (-b) to yellow (+b). The brightness (L) increases from the bottom to the top of the three-dimensional model.
With CIELAB what you see is what you get (in theory at least).
Colour Histogram
Given a discrete colour space defined by some colour axes (e.g., red, green, blue), the colour histogram is obtained by discretizing the image colours and counting the number of times each discrete colour occurs in the image.
The image colours that are transformed to a common discrete colour are usefully thought of as being in the same 3D histogram bin centered at that colour.
Colour Histogram Construction
Step 1
Colour quantization (discretizing the image colours)
Step 2
Count the number of times each discrete colour occurs in the image.
Colour Quantization
A true colour, 24-bit/pixel image (8 bits each for R, G and B) will have 2^24 = 16,777,216 bins!
That is, each image would have to be represented by over 16 million numbers -
computationally impractical, and not necessary
Colour quantization - reduce the number of (colours) bins
Simple Colour Quantization
Simple Colour Quantization (Non-adaptive)
Divide each colour axis into equal length sections (different axes can be divided differently).
Map (quantize) each colour into its corresponding bin
Simple Colour Quantization
[Axis diagram: each of the R, G and B axes divided into eight equal sections of width 32: 0-31, 32-63, 64-95, 96-127, 128-159, 160-191, 192-223, 224-255.]
Colour -> Bin
(123, 23, 45) -> (3, 0, 1)
(122, 28, 46) -> (3, 0, 1)
(132, 29, 50) -> (4, 0, 1)
(122, 172, 27) -> (3, 5, 0)
(121, 26, 48) -> (x, x, x)
(142, 28, 46) -> (x, x, x)
Example: In RGB space, quantize each image colour into one of 8x8x8 = 512 colour bins
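The mapping above can be sketched in one line of arithmetic (Python, our illustration): integer division by the section width (256/8 = 32) gives the bin index on each axis.

```python
def quantize(rgb, bins_per_axis=8):
    """Uniformly quantize an RGB colour (0-255 per channel) to a bin index triple."""
    width = 256 // bins_per_axis  # section width: 32 for 8 bins per axis
    return tuple(c // width for c in rgb)

print(quantize((123, 23, 45)))    # (3, 0, 1)
print(quantize((122, 172, 27)))   # (3, 5, 0)
```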
Advanced Colour Quantization
Adaptive colour quantization (not required)
- Vector quantization / K-means clustering produces K representative colours
- The colour histogram consists of K bins, each corresponding to one of the representative colours
- A pixel is classified as belonging to the nth bin if the nth representative colour is the one (amongst all the representative colours) that is closest to the pixel
[Diagram: pixels as points in the 3D RGB colour space, clustered around the representative colours.]
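The classification step can be sketched as follows (Python, our illustration; the four representative colours are made-up values - in practice they would come from K-means clustering of the image's pixels):

```python
# Hypothetical representative colours: red-ish, green-ish, blue-ish, grey
reps = [(200, 30, 30), (30, 200, 30), (30, 30, 200), (128, 128, 128)]

def nearest_bin(pixel):
    """Index of the representative colour closest to the pixel (squared L2)."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(range(len(reps)), key=lambda n: sqdist(pixel, reps[n]))

print(nearest_bin((190, 40, 25)))  # 0: closest to the red-ish representative
```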
Colour Histogram Construction - An Example
A 3 x 3, 24-bit/pixel image has the following RGB planes
Construct an 8-bin colour histogram (using simple colour quantization, treating each axis as equally important).
Red
23  24  77
11  24  69
22  12  12
Green
213  24  77
 11 232 239
 22  12  12
Blue
23  24  77
12  24  69
22 123 123
Bin (0,0,0) =   Bin (0,0,1) =   Bin (0,1,0) =   Bin (0,1,1) =
Bin (1,0,0) =   Bin (1,0,1) =   Bin (1,1,0) =   Bin (1,1,1) =
Colour Histogram Construction - An Example
Quantized Colour Planes
Count the number of times each discrete colour occurs in the image.
Red
0 0 0
0 0 0
0 0 0
Green
1 0 0
0 1 1
0 0 0
Blue
0 0 0
0 0 0
0 0 0
Bin (0,0,0) = 6 Bin (0,0,1) = 0 Bin (0,1,0) = 3 Bin (0,1,1) = 0
Bin (1,0,0) = 0 Bin (1,0,1) = 0 Bin (1,1,0) = 0 Bin (1,1,1) = 0
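The worked example can be checked mechanically. The sketch below (Python, our illustration) rebuilds the 3x3 image's 8-bin histogram by halving each axis (0-127 -> 0, 128-255 -> 1) and reproduces Bin(0,0,0) = 6 and Bin(0,1,0) = 3:

```python
from collections import Counter

# R, G, B planes of the 3x3 example image from the slides
R = [[23, 24, 77], [11, 24, 69], [22, 12, 12]]
G = [[213, 24, 77], [11, 232, 239], [22, 12, 12]]
B = [[23, 24, 77], [12, 24, 69], [22, 123, 123]]

# Two sections per axis gives 2 x 2 x 2 = 8 bins
hist = Counter(
    (R[i][j] // 128, G[i][j] // 128, B[i][j] // 128)
    for i in range(3) for j in range(3)
)
print(hist[(0, 0, 0)], hist[(0, 1, 0)])  # 6 3
```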
Colour Based Image Indexing
[Histogram plot: the number of pixels falling in each colour bin.]
Colour Distribution = (10, 0, 0, 0, 100, 10, 30, 0, 0)
The histogram of colours in an image defines the image colour distribution
Colour based Image Retrieval
Images are similar if their histograms are similar!
Image 1: Colour Distribution = (10, 0, 0, 0, 100, 10, 30, 0, 0)
Image 2: Colour Distribution = (10, 0, 0, 0, 90, 10, 40, 0, 0) - similar to Image 1!
Image 3: Colour Distribution = (0, 40, 0, 0, 0, 0, 0, 110, 0) - dissimilar
Formalizing Similarity
H1 = Colour Distribution of Image 1 = (10, 0, 0, 0, 100, 10, 30, 0, 0)
H2 = Colour Distribution of Image 2 = (0, 40, 0, 0, 0, 0, 0, 110, 0)
Similarity(Image 1, Image 2) = D(H1, H2)
where D( ) is a distance measure between vectors (histograms) H1 and H2
Metric Distances
A distance measure D( ) is a good measure if it is a metric!
D(a,b) is a metric if
- D(a,a) = 0 (the distance from a to itself is 0)
- D(a,b) = D(b,a) (the distance from a to b = the distance from b to a)
- D(a,c) <= D(a,b) + D(b,c) (triangle inequality: the straight-line distance is always the least! D(a,b) + D(b,c) should be no smaller than D(a,c))
Common Metric Distance measures
Histogram Intersection, HI
HI(H1, H2) = Σ_{i=1..n} min(H1_i, H2_i)
H1 = (10, 0, 0, 0, 100, 10, 30, 0, 0)
H2 = ( 0, 40, 0, 0, 0, 6, 0, 110, 0)
Similarity = HI(H1, H2) = 0 + 0 + 0 + 0 + 0 + 6 + 0 + 0 + 0 = 6
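A minimal sketch of histogram intersection (Python, our illustration), reproducing the example above:

```python
def histogram_intersection(h1, h2):
    """HI(H1, H2) = sum over bins of min(H1_i, H2_i)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

H1 = (10, 0, 0, 0, 100, 10, 30, 0, 0)
H2 = (0, 40, 0, 0, 0, 6, 0, 110, 0)
print(histogram_intersection(H1, H2))  # 6
```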
Common Metric Distance measures
Euclidean or straight-line distance or L2-norm, D2
D2(H1, H2) = ||H1 - H2||_2 = sqrt( Σ_i (H1_i - H2_i)^2 )   (root-mean-square error)
H1 = (10, 0, 0)
H2 = ( 0, 40, 0)
Similarity = D2(H1, H2) = sqrt(100 + 1600 +0) = 41.23
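The same computation as code (Python, our illustration):

```python
import math

def d2(h1, h2):
    """L2 (Euclidean) distance between two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

print(round(d2((10, 0, 0), (0, 40, 0)), 2))  # 41.23
```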
Common Metric Distance measures
Manhattan or city-block or L1-norm, D1
D1(H1, H2) = ||H1 - H2||_1 = Σ_i |H1_i - H2_i|   (sum of absolute differences)
H1 = (10, 0, 0)
H2 = ( 0, 40, 0)
Similarity = D1(H1, H2) = (10 + 40 +0) = 50
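And the city-block distance as code (Python, our illustration):

```python
def d1(h1, h2):
    """L1 (city-block) distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

print(d1((10, 0, 0), (0, 40, 0)))  # 50
```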
Histogram Intersection vs City Block Distance
Theorem: if H1 and H2 are colour histograms and the total count in each is N (there are N pixels in an image), then:
Σ_i min(H1_i, H2_i) = N - 0.5 ||H1 - H2||_1
(Histogram intersection is inversely related to a metric distance!)
Proof
HI(H1, H2) = Σ_i min(H1_i, H2_i)   (by definition)   (1)
min(a,b) + max(a,b) = a + b   (2)
Histogram Intersection vs City Block Distance
max(a,b) - min(a,b) = |a - b|   (3)
Substituting (2) and (3) into (1):
2 Σ_i min(H1_i, H2_i) = Σ_i [ min(H1_i, H2_i) + max(H1_i, H2_i) ] - Σ_i [ max(H1_i, H2_i) - min(H1_i, H2_i) ]
                      = Σ_i (H1_i + H2_i) - Σ_i |H1_i - H2_i|   (4)
                      = 2N - ||H1 - H2||_1   (5)
Hence Σ_i min(H1_i, H2_i) = N - 0.5 ||H1 - H2||_1
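The identity can also be checked numerically. The sketch below (Python, our illustration) uses two example histograms, both summing to N = 150:

```python
# Check HI(H1, H2) = N - 0.5 * ||H1 - H2||_1 for equal-count histograms
H1 = (10, 0, 0, 0, 100, 10, 30, 0, 0)   # sums to N = 150
H2 = (0, 40, 0, 0, 0, 0, 0, 110, 0)     # also sums to 150

N = sum(H1)
hi = sum(min(a, b) for a, b in zip(H1, H2))   # histogram intersection
l1 = sum(abs(a - b) for a, b in zip(H1, H2))  # city-block distance
print(hi, N - 0.5 * l1)  # the two sides agree (here both are 0)
```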
Colour Histogram Database
[Diagram: (1) build a colour histogram for each image in the collection; (2) store the histograms in the histogram database.]
How well does colour histogram intersection work?
Swain's original test:
- 66 test histograms in the database
- 31 query images
- Recognition rate almost 100%
Indeed, because colour indexing worked so well, it is at the heart of almost all image database systems.
Google Image Search
Google Image Search
After clicking this colour patch
Problems with color histogram matching
1. Color Quantization problem:
Colour Distribution = (0,40,0,0,0,0,0,110,0)
Colour Distribution = (0,0,40,0,0,0,0,0,110)
Because the two images have slightly different colour distributions, their histograms have nothing in common - 0 intersection!
Sources of quantization error: noise, illumination, camera
Problems with color histogram matching
2. The resolution of a color histogram
Colour Distribution = (0,40,0,0,0 … ,0,0,110,0)
For the best results, Swain quantized the colour space into 4096 distinct colours => each colour distribution is a 4096-dimensional vector.
=> Histogram intersection costs O(4096) operations (some constant × 4096)
4096 comparisons per database histogram => histogram intersection will be very slow for large databases
Many newer methods work well using 8- to 64-dimensional features
Problems with color histogram matching
3. The colour of the light
Under a yellowish light, all image colours are more yellow than they ought to be
Problems with color histogram matching
4. The structure of colour distribution
All four images have the same colour distribution - need to take spatial relationships into account!
Problem solution => Use statistical moments
1st order statistics
Mean = (μR, μG, μB)
2nd order statistics
Variance/Covariance = (σR², σG², σB², σRG, σRB, σGB)
σR² = (1/N) Σ_i (R_i - μR)²
σRG = (1/N) Σ_i (R_i - μR)(G_i - μG)
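As a sketch of these definitions (Python, our illustration, with a small made-up list of (R, G, B) pixels):

```python
# Per-channel mean, variance, and one cross-channel covariance
pixels = [(10, 20, 30), (20, 10, 40), (30, 30, 20), (20, 20, 30)]
N = len(pixels)

mu_R = sum(p[0] for p in pixels) / N
mu_G = sum(p[1] for p in pixels) / N

var_R = sum((p[0] - mu_R) ** 2 for p in pixels) / N                 # sigma_R^2
cov_RG = sum((p[0] - mu_R) * (p[1] - mu_G) for p in pixels) / N     # sigma_RG

print(mu_R, var_R, cov_RG)  # 20.0 50.0 25.0
```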
Statistical similarity
Image 1: Colour Distribution (mean RGB) = (50, 50, 50)
Image 2: Colour Distribution (mean RGB) = (20, 70, 40)
Statistical similarity = sqrt( [μR(I1) - μR(I2)]² + [μG(I1) - μG(I2)]² + [μB(I1) - μB(I2)]² + … )
Compare mean RGBs (in general, compare all statistical measures)
(the Euclidean distance between corresponding statistical measures)
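A minimal sketch of this comparison (Python; the function name is ours), using the two mean RGBs above:

```python
import math

def stat_distance(stats1, stats2):
    """Euclidean distance between two vectors of statistical measures
    (e.g., mean R, mean G, mean B, variances, covariances, ...)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(stats1, stats2)))

print(round(stat_distance((50, 50, 50), (20, 70, 40)), 2))  # 37.42
```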
Histogram vs Statistical Similarity
                          | Completeness of             | #params /       | Sensitivity to
                          | representation              | match speed     | quantization error
histogram                 | complete                    | many / slow     | sensitive
low-order stats           | incomplete                  | few / fast      | insensitive
low- and high-order stats | complete (or over-complete) | many / moderate | sensitive
Advanced Topics
- Fast indexing
- Interactive/relevance feedback
- Reducing the semantic gap
- Visualization, navigation, browsing
- Internet-scale image/video retrieval
  - Flickr: billions of photos
  - YouTube: billions of videos
  - …