Date post: | 19-Dec-2015 |
Category: |
Documents |
View: | 218 times |
Download: | 5 times |
Image Database Retrieval
Krisztián Veréb PhDDepartment of Information Technology
Faculty of Computer Science and Information Technology University of Debrecen
IPCV 2006 Budapest
Image Databases
• Image databases can– Store images– Manage images (process)– Retrieve images
IPCV 2006 Budapest
Databases
• Databases can store images– As BLOB– As Object
• Databases can process images– With outer (external) methods– With inner (internal) methods
• To retrieve images they use– Content-Based Image Retrieval– Visual Information Retrieval
IPCV 2006 Budapest
Real Image Databases
• Image databases can– Store images in the database level
(with native image objects)– Manage images in the database level (it comes
from the native objects)– Retrieve images in the database level
(with native stored procedures)
IPCV 2006 Budapest
Image Database Facilities
• Storing, sorting, managing large amount of images
• Designed and planned management of related information
• Image processing• Image similarity• Image visualization
IPCV 2006 Budapest
Function of Image Databases
• Extensional role– In image processing tools (Léna)– In medical image processing (CT, MRI)– In art collections (painting)– In historical archives (cronology with images)
• Central role– Data BASED systems (police)– Visulal Information Systems– General image databases
IPCV 2006 Budapest
Characteristics of General DB
• Similarity – Supports the visual similarity measuring
• Generality– The domain of usage is not restricted (it is not computer
vision, it is not image processing)
• Interaction– There are input and output, and it can used as
information systems
• Data complexity– Large amount of various data (Image, feature vector,
stored procedure as well as the familiar database types)
IPCV 2006 Budapest
Expectations and Solutions
IPCV 2006 Budapest
• Stability – Usage of proven (old) techniques
• High rate– Usage of small, ‘dumb’ techniques
• Low cost– Usage of legacy systems
Connection of SciencesRGB, HSI, YUV, hisztogramm, Minkowski Jeffrey distance, Gauss-Markov fields, texture, autocorrelation, Fourier transform, mathematical morphology, image segmentation, neighbourhood, Freeman code, Hausdorff distance, Affine transformation, attentive and preattentive features, Lp metrics, Fuzzy logics, computer vision
Binary Large Object (BLOB), BFILE, structural data, table spaces, content management, SQL (score) query, multimedia indexing, space partitions, data partitions, semantical indexing, ORDSource, ORDVIR (8.1.7), ORDImage (9i), Oracle Still Image (10g), ORD class hierarchy, PLSQL, XML, PSP, DB2 QBIC, Sigma-QL, data mining,
CBIR VIR
IPCV 2006 Budapest
Image Storing
• Storing only images– Albums, galeries, archives
• Storing images with related information– Painting, Police databases
• Storing text with related images– on-line books, articles, publication
IPCV 2006 Budapest
Image Database Users
• Specific enquierer– I need the famous image of Lena
• General enquierer– I need some pictures of Lena
• The story teller enquierer– I need picures about image processing, for
segmentation on that girl, you know…
• The story giver enquierer– I need pictures about image processing…
• The space-filler enquierer– I need a picture to fill the empty space in my article
IPCV 2006 Budapest
Main Retrieval Concepts
• Text-based image retrieval– linguistical
• Image-based text retrieval– investigational
• Image-based image retrieval– CBIR
IPCV 2006 Budapest
CB Image Comparing
• Based on feature vectors– Representation– Characteristic
• Using special interfaces– Similar picture based– Sketch based– Iconic
• The output is a similarity number given by the matching algorithms
IPCV 2006 Budapest
Working of the Retrieval
Query image
Feature vector
Matching
Feature vector
Candidate images
Result set
IPCV 2006 Budapest
Main Matching Properties
• Vector extraction:• Image distance:• Vector distance:
• Distance properties:
IPCV 2006 Budapest
Main Query Types
• Exact (identical) query:
• Epsilon query:
• Nearest neighbour query:
IPCV 2006 Budapest
Feature (vectors)
• Color– Local and global histograms
• Shape– Patches, segments
• Texture– Statistics of the pixel color changes
• Geometrical location– The spatial organization of the above
mentioned features
IPCV 2006 Budapest
Color Based Retrieval (1)
• Mainly based on color histogramsMinkowski distance:
Jeffrey distance:
Bhattacharyya distance (normalized):
Intersection distance:
IPCV 2006 Budapest
Color Based Retrieval (2)
• In case of spatial query the histograms can be computed locally
• It is common, that images are cut into n pieces (usually n = 9), and thus we have n + 1 histograms (local and global)
IPCV 2006 Budapest
Texture Based Retrieval (1)
• Mainly based on the co-occurance matrix
Pd(i,j)=|{ (p1,p2) : p2=p1+d, f(p1)=i, f(p2)=j }|
So it is the number of occurances of the pair of gray levels i and j which distance d apart.
IPCV 2006 Budapest
Texture Based Retrieval (2)
• Energy:
ΣiΣj (Pd(i,j))2 • Entropy:
- ΣiΣj Pd(i,j)logPd(i,j)• Contrast:
ΣiΣj (i-j)2Pd(i,j)• Homogeneity:
ΣiΣj (Pd(i,j)/(1+|i-j|)• Correlation:
ΣiΣj (i-E(ΣjPd(x,j)))(j-E(ΣiPd(i,y)))Pd(i,j)/
(σ(ΣjPd(x,j))σ(ΣiPd(i,y)))
IPCV 2006 Budapest
Shape Based Retrieval (1)
• Mainly based on segmentation.
• First, the image has to be segmented patches (the small patches are considered as noise).
• Then the distance of the query and the candidate patches has to be comuted.
IPCV 2006 Budapest
Shape Based Retrieval (2)
• Boundary techniques– Boundary evolution (iteratively smooth the
boundaries while they will equals, the distance is the number of smooting)
– Statistical method (the chains are considered as samples of random variables, and the distance is a homogenity test)
– Stochastical method (the boundary is considered as a trajectory of a stochastic process, and the distance is the probability of the matching of the observed boundary and the computed one)
IPCV 2006 Budapest
Shape Based Retrieval (3)
• Point set techniques– Area of overlap and symmetric difference
area(A∩B) and area((A\B)U(B\A))
– Hausdorff distance
dH(A,B)=max{δ(A,B),δ(B,A)}
where
δ(A,B)=maxaAminbBd(a,b)
IPCV 2006 Budapest
Spatial Query
• The structural features (geometrical location) can be represented with graphs– Graph transformation
• The graph can be represented with a neigbourhood matrix N(nn)
– The spatial distance is the distance of the principal components in
N=AΛV’
IPCV 2006 Budapest
Distance and Similarity (1)
• Distance is a non-negative real number• In case of metric spaces it meets the
properties:– Self similarity– Minimality– Simmetricity– Triangle inequality
• Spaces in image databases usually non-metric spaces (without triangle inequality)
IPCV 2006 Budapest
Distance and Similarity (2)
• Similarity is a number between 0 and N. N means the images are equal. (Usually N = 1 ) It meets the properties:– Self similarity– Simmetricity
• Similarity can be computed from the distance• Distance cannot be computed from the
similarity (in general)• Image databases usually use similarity
IPCV 2006 Budapest
Score (1)
• The score is a number, representing the final distance or similarity for a user query
• E.g. it can be a weighted sum for distances d1 d2 d3 d4 (in case of four feature)
d = w1d1 + w2d2 + w3d3 + w4d4
IPCV 2006 Budapest
Score (2)
• In case of similarities it can be used fuzzy techniques
• E.g. for similarities s1 s2 s3 s4 (in case of four feature)
s = s1 s2 s3 s4
where
x y = max { 0, x + y – 1 }
or in case of weighting
wx qy = max { wx, qy }
IPCV 2006 Budapest
Interfaces
• Iconic– Icons represent features– Spatial friendly
• Sketch-based– Skecth is a ‘hand draw’– Shape friendly
• Similar picure based– Unknow source – Usually color friendly, but …
IPCV 2006 Budapest
Indexing
• Mathematical partitioning of the vector space– Data partitioning– Space partitioning
• Semantical prefiltering of the candidate images– Identifying the ‘important’ features– Classifying the objects
IPCV 2006 Budapest
Tree Indexing
• Common techniques:
– Quadtree (Spatial)– R-tree (MM)– Kd-tree (MM)– B-tree (Legacy)
IPCV 2006 Budapest
Other Indexing Techniques
• Text indexing– From the linguistic representation using
bitvector indexing technique
• OO type-tree indexing– From the OO representation using existing
indexes linked by the inheritance tree
IPCV 2006 Budapest
Linguistic Features (1)
• Colours:– Red, White, etc.
• Shapes:– Triangle, rectangle, etc.
• Textures:– Striped, dotted
• Spatial:– Disjoint, overlap, covers-covered,
contains, inside, equal
IPCV 2006 Budapest
Linguistic Features (2)
• Modifiers:– Leopard (-spotted), silver (-gray)
• Moods (feels):– Blue, sad, comic, depressed
• Themes:– Art, film, painting, sculpture
• Objects:– Cars, flowers, buildings
IPCV 2006 Budapest
Linguistic Representation• Meta relations
– For fact, modifier and mood classesR( Word, Word-class )
• Data relations– Simple
R1( Fact, Weight, Image-ID )
– ModifiedR2( Fact, Modifier, Image-ID )
– BinaryR3( Fact1, Fact2, Link, Image-ID )
– TernaryR4( Fact1, Fact2, Fact3, Image-ID )
IPCV 2006 Budapest
Linguistic querying
• With simple SELECT SQL statements
select Image-ID from R, R1, R2 where …
• With SLD resolutionfacts( fact1, image ).rules( Fact1, Fact2, Link, Image ) :-
facts( Fact1, Image), facts(Fact2, Image) …?- rules( fact1, fact2, link, Image ).
IPCV 2006 Budapest
Spaces in Image Databases (1)
• The user makes a query q in – space Q (Query)
• Using an interface – space C (Composition)
• The image feature vectors are in– space F (Feature)
• The systems gives the result set from F using q in – space O (Output)
• It has to be displayed in – space D (Display)
IPCV 2006 Budapest
User Interface
Matching
space F
Spaces in Image Databases (2)
space C
space Q space O
space D
User
IPCV 2006 Budapest
About Space O (1)
• Images:
• Matching:
• Weights:
• A similarity result:
• Results for a query:
IPCV 2006 Budapest
The Line Model (1)
• Images f are ordered by r(q) into a line• It’s well applicable in case of weights• It has good feature to show text with images
as well• It has poor navigation feature• It has small computational time
IPCV 2006 Budapest
The Matrix Model (1)
• Images f are ordered by r(q) into a line, and then they are grouped by 9 images
• It’s well applicable in case of weights• It has good feature space improvement
(9 images are shown in the same time) • It has better (but still poor) navigation feature• It has small computational time
IPCV 2006 Budapest
The Fish-eye Model (1)
• The nD space O has to be pre-projected into 2D• The 2D space are stretched onto a hemisphere• It doesn’t need to use weighting• It has good feature space improvement • It has great navigation feature
(by rolling the stretched space on the hemisphere)
• It is good in indicating clusters• It has big computational time
IPCV 2006 Budapest
The Star Model (1)
• It doesn’t need a pre-projection
• It doesn’t need to use weighting
• It has good feature space improvement
• It has great navigation feature
• It has small computational time
• It isn’t good in indicating clusters
IPCV 2006 Budapest
Comparison of the Techniques
Line model - - + - +
Matrix model + - - - +
Fish-Eye model + + - + -
Star model + + - - +
Space Improvement
Navigation Feature
Extra Information
Indicate Clusters
Fast Computation
IPCV 2006 Budapest
Actual Questions
• ORDBMS vs web-based archives?• Decreasing the number of candidate images.• Low-level features vs semantical ones?• Use one-step queries or navigate step-by-
step?• Problem of question formalization and query
languages.• Using general algorithms or special ones?• How many algorithms have to be used?
IPCV 2006 Budapest
Concepts (1)
• Use legacy DBMS’s.• Model the images with OO.• Objects have representation, features and
matching algorithms.• Use many matching algorithms and not only
one!• Searching must be based on image parts
where many parts of different images with different methods should be matched.
IPCV 2006 Budapest
Concepts (2)
• An image must have many different feature vectors to be stored.
• Multimedia indexing techniques have to be used.
• Motives have to be extracted, stored and used in the retrieval.
• Use a class hierarchy to classify the motives. The hierarchy is built on the existing index structure as a secondary semantical index.
• Use textual information in the retrieval.
IPCV 2006 Budapest
Using OO technology
• The image is an object• It stores the features • It stores the management algorithms• It stores the matching algorithms
• You can use more algorithms• It has an inheritance tree
• Indexing and polimorphism
IPCV 2006 Budapest
• Simple inheritance, special matching
Using the Inheritance Tree
IPCV 2006 Budapest
Image
Colored BW
LandscapePortrait
• Typified query, hierarchical indexing
Type Tree
Image
Colored BW
Portrait Landscape
IPCV 2006 Budapest
Polimorphism
• Matching polimorphism• Each subclass inherits the matchings of
superclass and can specialize them• Vector polimorphism
• Algorithms can use different vectors• The vectors are inherited too, (as well
as the extraction), and they can be specialized
IPCV 2006 Budapest
Object indexing (1)
• Every node in the type tree represents a table
• Every table contains pointers (references)• The pointers (references) refers the
instances of the particular class and its subclasses
IPCV 2006 Budapest
Object indexing (2)
• Every table can be indexed by a data- or space partitioning technique
• The indices indexes the images referred by the table content
• Thus we have a multilevel hierarchycal index tree
• It is associative (semantical) depending on the type tree
IPCV 2006 Budapest
Object indexing (3)
• The class hierarchy C contains classes Cj, j = 1,…,N
• For every object O in the database there exists a class Cj that O Cj ( j { 1, N } )
• Notation:• C1 → C2 : C1 is the parent of C2
• Cp < Cq : C1,…,Cn, Cp = C1, Cq = Cn, Ci → Ci+1, i = 1,…,n-1 (Cq is a descendant of Cp)
IPCV 2006 Budapest
Object indexing (4)
• Objects can be indexed if
CkCi, i k, Ck < Ci, and Cj, Cj < Ck, i,j,k { 1,…,N }– There exists root
CiCk, i,k { 1,…,N }, i k, Ci isn’t root,Ck → Ci, and Cl, if Cl → Ci, then Cl = Ck – There is only one parent for a node, except the
root
IPCV 2006 Budapest
Typified Query (1)
• The query image type has to be identified• The type allocates a type tree nod• The node has an assigned matching
algorithm and a vector type• The node refers a table of image pointers• The table’s images are indexed by an
indexing algorithm
IPCV 2006 Budapest
• Extract the feature vector of the query image
• The features of the images referred by the node’s table and the query features have to be compared by the identified matching algorithm
• The resulted pointers locates the result image set
• If no result, then the algorithm has to be iterated torwards the parents (super class)
IPCV 2006 Budapest
Typified Query (2)
• Identical typified query:
• Epsilon typified query:
• NN typified query:
IPCV 2006 Budapest
Typified Query (3)
Advantages of Using OO
• Facilities of OODBMS and ORDBMS instead of RDBMS
• Extentionality (inheritance)• Uniform handling (Interface)• Specialization (matching and vector
polimorphism)• Multilevel, hierarchycal indexing• Typified query
IPCV 2006 Budapest
Existing General Systems
• Oracle– Oracle 8.1.7 with VIR cartridge– Oracle 9i R1 and R2 with standard feature– Oracle 10g with Still Image support
• IBM DB 2– With image extenders QBIC (fuddy-duddy)
IPCV 2006 Budapest
Features of IBM DB2
• Average color
• Color histogram
• Positional color
• Texture
IPCV 2006 Budapest
Features of Oracle 9
• Colour
• Shape
• Texture
• All of them with location (as a new value)
IPCV 2006 Budapest
Score of Oracle 9• The result is the weighted sum of the
above mentioned four matching value between 0 and 100
d = w1dc + w2ds + w3dt + w4dl
The weights are normalised that
w1 + w2 + w3 + w4 = 100
IPCV 2006 Budapest
Oracle 9 Realisation
Schema: ORDSYSUsed for handling the types and routines
Objects: ORDImage, ORDImageSignatureCan be used in tables for storing
Methods: generateSignature, evaluateScoreCan be used in PL/SQL routines for retrieval and
management
Relational interface: isSimilarCan be used in SELECT statements for retrieval
IPCV 2006 Budapest