MMDB-6 J. Teuhola 2012 1
6. Image databases
Image representations:
- Digitized (sampled) representation of field-based spatial data
- ‘Raw’ images, also called digital images, bitmapped images, or raster images
- m × n matrix of pixels; resolution = sampling rate, in pixels per inch
- Each pixel is represented by k bits (= accuracy = color depth); 2^k possible values.
Image types:
- Binary (bi-level) images (black = 0, white = 1; e.g. telefax)
- Grey-scale images (usually k = 8; enables 256 grey-levels)
- Color images (various representations)
Sources:
- Devices: scanner, digital camera, electron microscope, medical imaging devices (PET, MRI)
- Wavelengths: visible light, infrared, X-rays
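As a small worked example of the figures above, the number of distinct pixel values and the uncompressed size of an image follow directly from m, n and k. The function names are illustrative only and are not part of the slides.

```python
def levels(k: int) -> int:
    """Number of distinct values a k-bit pixel can take (2^k)."""
    return 2 ** k


def raw_size_bytes(m: int, n: int, k: int) -> int:
    """Bytes needed for an uncompressed m x n image with k bits per pixel,
    rounded up to a whole number of bytes."""
    total_bits = m * n * k
    return (total_bits + 7) // 8


print(levels(8))                       # grey-scale image with k = 8
print(raw_size_bytes(1024, 768, 24))   # 24-bit RGB image
```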
Color images
‘True’ color schemes:
- Three components per pixel (possibly a 4th for the α-channel = transparency)
- RGB = Red - Green - Blue (typically 3 x 8 = 24 bits per pixel)
- CMY = Cyan - Magenta - Yellow (CMYK used in printing, K = black)
- HSI = Hue - Saturation - Intensity (used in image processing)
- YUV / YCbCr = Luminance (brightness) + 2 x chrominance (color)
  - Used in image compression (JPEG)
  - Correlations between color components are reduced.
  - Most information is collected in the Y-component.
Indexed color schemes:
- Palette of e.g. 256 colors
- Mapping table from color indices to RGB-values
- Saves space, sufficient for many applications
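A minimal sketch of the RGB to YCbCr conversion, to show how the luminance component gathers most of the information. The slides give no coefficients; the ones below are the full-range values used by JFIF (an assumption here), and the function name is illustrative.

```python
def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple:
    """Convert one 8-bit RGB pixel to 8-bit YCbCr.

    Coefficients: full-range JFIF convention (assumed, not from the slides).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component to the 0..255 range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


# Any grey pixel has neutral chrominance (128, 128); only Y varies.
print(rgb_to_ycbcr(100, 100, 100))
```

For grey pixels both chrominance components stay at the neutral value, which is one way to see why the color components become less correlated than in RGB.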
Image formats
Tens of formats exist for different environments and applications, e.g.
- BMP = Bitmap image file (MS Windows)
- GIF = Graphics Interchange Format (indexed colors; includes compression, supported by web browsers)
- JBIG = Joint Bi-level Image experts Group file interchange format
- JPEG = Joint Photographic Experts Group (JFIF = JPEG File Interchange Format)
- JP2 = JPEG 2000
- PBM = Portable Bitmap Format (black-and-white)
- PGM = Portable Greymap Format (grey-scale)
- PPM = Portable Pixmap Format (color)
- PNG = Portable Network Graphics
- TIFF = Tagged Image File Format (large number of options)
Image compression
- Necessary for large image archives: saves space, reduces transmission time.
- Possible due to redundancy in images
- Several methods, specialized for different types of images
Image formats with compression:
- JPEG, based on cosine transform
- JPEG 2000, based on wavelet transform
- GIF, based on LZW string compression
- PNG, based on LZ77 string compression
- JBIG (bi-level images), based on prediction by context
Compression method characteristics
- Lossless / lossy methods: Can the original image be recovered precisely or only approximately? E.g. JPEG is typically lossy.
- Compression efficiency (bit rate), measured in bits/pixel
- Speed (separately for compression and decompression)
- Distortion (for lossy methods):
  - MAE = Mean Absolute Error
  - MSE = Mean Squared Error
  - RMS = Root Mean Square error
  - SNR = Signal to Noise Ratio
  - PSNR = Peak Signal to Noise Ratio
- Robustness against transmission errors
- Blockiness, blurring, ... (for lossy methods)
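The distortion measures and the bit rate can be sketched as follows, using their usual textbook definitions (the slides only name them). Images are assumed to be flat sequences of grey-levels with peak value 255; SNR is left out because its definition varies between sources. All names are illustrative.

```python
import math


def distortion(original, decoded, peak=255):
    """MAE, MSE, RMS and PSNR between two equally long pixel sequences."""
    diffs = [a - b for a, b in zip(original, decoded)]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n
    mse = sum(d * d for d in diffs) / n
    rms = math.sqrt(mse)
    # PSNR in decibels; infinite when the images are identical.
    psnr = math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
    return {"MAE": mae, "MSE": mse, "RMS": rms, "PSNR": psnr}


def bit_rate(compressed_bytes: int, m: int, n: int) -> float:
    """Compression efficiency in bits per pixel for an m x n image."""
    return 8 * compressed_bytes / (m * n)


print(distortion([10, 20, 30, 40], [12, 18, 30, 44]))
print(bit_rate(compressed_bytes=32768, m=512, n=512))
```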
Searching from an image database
1. Using a hierarchical classification of images:
The user follows paths in the hierarchy, e.g.
Art works → Paintings → France → 18th century
2. Search using keywords in metadata
Images can be considered similar to documents with index terms
3. Search by content features
Pattern matching based on similarity with a query image, shape, color distribution, etc.
Feature extraction and indexing of images
- Extraction of descriptive attributes from images
- Manually, automatically, or using a hybrid scheme (automatic segmentation & manual assignment of properties).
Manual indexing:
- Performed by a ‘knowledge worker’, trained on the patterns and vocabulary of the image database application
- Multiple indexers: strict consistency rules, common glossary.
- Automatic tools may help in pattern recognition.
- Each interesting object (spatial structure) is presented manually to the system for indexing, equipped with descriptive attributes.
- Assistance in selecting index terms: hierarchical dictionaries, cross-referencing systems, domain thesaurus.
- Time-consuming and costly; community indexing is a possibility, cf. http://gimp-savvy.com/PHOTO-ARCHIVE/
Automatic indexing
- Specialized for various application domains (document recognition, optical character recognition (OCR), engineering drawings, x-rays, ...)
- The system must first ‘learn’ and categorize domain element objects.
- A certain amount of uncertainty (fuzziness) must be tolerated.
- Important area of automatic image analysis and object recognition: transformation of paper documents into digital form, and indexing those documents appropriately (so-called document imaging; digital libraries).
Color feature extraction
Usually based on color histograms, i.e. the number of pixels of each color (or color component):
- Separate histograms can be built for various subregions of the image (e.g. top-left, top-right, middle, ...)
- The quantization can be made coarser than 0..255 by grouping adjacent histogram values, in order to reduce the dimensionality of the resulting feature vectors.
[Figure: three histograms, one each for RED, GREEN and BLUE; x-axis: component value 0..255, y-axis: #pixels]
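A minimal sketch of per-component histograms with coarser bins. Pixels are assumed to be (r, g, b) tuples with values 0..255; the `bins` parameter and the function name are illustrative.

```python
def color_histograms(pixels, bins=8):
    """Return [red_hist, green_hist, blue_hist], each a list of `bins` counts.

    Adjacent component values are grouped so that 0..255 maps onto
    0..bins-1, which shortens the resulting feature vector.
    """
    hist = [[0] * bins for _ in range(3)]
    for pixel in pixels:
        for component, value in enumerate(pixel):
            hist[component][value * bins // 256] += 1
    return hist


pixels = [(0, 0, 0), (255, 255, 255), (31, 32, 200)]
red, green, blue = color_histograms(pixels, bins=8)
feature_vector = red + green + blue   # 3 * 8 = 24 dimensions
print(feature_vector)
```

Calling the same function on the pixels of each subregion gives the separate per-region histograms mentioned above.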
Image segmentation
- Detection of interesting regions within images.
- A segment is a connected region that satisfies a homogeneity predicate.
- Basis for subsequent search.
- One of the most difficult tasks in image processing.
- Several possible (heuristic) methods.
Connected region:
- For each pair $(x_1, y_1)$, $(x_n, y_n)$ of pixels in the region, there exists a chain of pixels $(x_1, y_1), \ldots, (x_n, y_n)$ in the region such that $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$ are adjacent for all $i$.
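The connectedness definition can be checked with a breadth-first search: start from any pixel and see whether every other pixel of the region is reached through adjacent pixels. The sketch assumes 4-adjacency (horizontal and vertical neighbours); the slides do not say which adjacency is meant.

```python
from collections import deque


def is_connected(region) -> bool:
    """True if every pair of pixels in `region` is linked by a chain of
    4-adjacent pixels that all lie inside the region."""
    region = set(region)
    if not region:
        return True
    start = next(iter(region))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for neighbour in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if neighbour in region and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == region


print(is_connected({(0, 0), (0, 1), (1, 1)}))  # L-shaped region
print(is_connected({(0, 0), (1, 1)}))          # diagonal neighbours only
```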
Examples of homogeneity predicates
- Binary images: at least p % of the pixels of the connected region have the same color (black or white).
- Classified grey-scales, e.g. 0...9, 10...19, etc. A connected region is homogeneous if at least p % of its pixels belong to the same class.
- Dynamic grey-scale classification: class boundaries are not predefined, but the interval size is: p % of the cells should have a grey-level within an interval of δ units.
- Grey-scale images with a reference function f for homogeneity: the number of pixels in { (x, y) : |grey-level(x, y) - f(x, y)| < δ } should be at least p % of the pixels in the region.
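Two of the predicates can be sketched as follows. A region is assumed to be given as a list of its grey-levels. The reading of the dynamic variant (some window of fixed width, placed anywhere, must hold at least p % of the pixels) is an interpretation, and the parameter names `class_width` and `width` are illustrative.

```python
from collections import Counter


def homogeneous_classified(grey_levels, p, class_width=10):
    """Fixed classes 0..9, 10..19, ...: at least p % of the pixels must
    fall into one class."""
    if not grey_levels:
        return False
    counts = Counter(g // class_width for g in grey_levels)
    return 100 * max(counts.values()) >= p * len(grey_levels)


def homogeneous_dynamic(grey_levels, p, width):
    """No fixed class boundaries: at least p % of the pixels must fall into
    some interval [lo, lo + width) of grey-levels (width > 0)."""
    values = sorted(grey_levels)
    low = 0
    for high in range(len(values)):
        # Shrink the window until it spans fewer than `width` grey-levels.
        while values[high] - values[low] >= width:
            low += 1
        if 100 * (high - low + 1) >= p * len(values):
            return True
    return False


region = [8, 9, 10, 11, 50]
print(homogeneous_classified(region, p=80))     # split across classes 0..9 and 10..19
print(homogeneous_dynamic(region, p=80, width=10))
```

The example region straddles a fixed class boundary, so the classified predicate rejects it while the dynamic one accepts it.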
Miscellaneous segmentation techniques
(a) Regular block segmentation:
- Example: quadtree or binary tree decomposition until homogeneous regions are obtained.
- Does not usually satisfy the maximality condition for segmentation: neighboring blocks may constitute a homogeneous region.
- Generalization of binary tree segmentation: blocks can be split in any direction: polygon segmentation. Compromise solution: split lines only in the 0°, 45°, 90°, and 135° directions.
(b) Splitting and merging:
- Augments category (a) methods to satisfy the maximality condition.
- Merging tests the obtained regions pairwise for homogeneity.
- Does not usually produce a unique segmentation for an arbitrary homogeneity predicate.
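A minimal sketch of quadtree decomposition, category (a). It assumes a square image whose side is a power of two, stored as a list of rows, and takes the homogeneity predicate as a function argument; the names are illustrative.

```python
def quadtree(img, x, y, size, homogeneous):
    """Split the block with top-left corner (x, y) and side `size` into four
    quadrants until each block satisfies `homogeneous`.
    Returns a list of (x, y, size) blocks."""
    block = [img[row][col]
             for row in range(y, y + size)
             for col in range(x, x + size)]
    if size == 1 or homogeneous(block):
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += quadtree(img, x + dx, y + dy, half, homogeneous)
    return blocks


def uniform(block):
    """Strictest predicate: all pixels of the block are equal."""
    return len(set(block)) == 1


img = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
print(quadtree(img, 0, 0, 4, uniform))
```

In this example the blocks (0, 0, 2) and (0, 2, 2) are both all-zero neighbours, which illustrates why plain block segmentation does not satisfy the maximality condition and a merging step (b) is needed.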
Miscellaneous segmentation techniques (cont.)
(c) Thresholding:
- Applicable if objects of interest and the background have sufficiently distinct grey-level values.
- The grey-level histogram of the image has two or more peaks, between which we can choose the threshold grey-level values.
- Must usually be augmented with more sophisticated techniques.
(d) Region growing:
- Start from a set of seed points.
- Include neighboring pixels as long as homogeneity holds.
- Difficulty: How to choose the seeds?
(e) Edge-following algorithms:
- Follow a (hopefully circular) path of largest gradients (steepest slope) around the object to be detected.
Example: thresholding
[Figure: thresholding example, threshold = 128]
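Thresholding a grey-scale image at 128 can be sketched in one line. Whether a pixel exactly at the threshold counts as object or background is an assumption here (it is mapped to 1).

```python
def threshold(img, t=128):
    """Map a grey-scale image (list of rows) to a binary image:
    1 where the grey-level is at least t, otherwise 0."""
    return [[1 if value >= t else 0 for value in row] for row in img]


print(threshold([[0, 127, 128], [200, 255, 50]]))
```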
Examples:
- Region growing (tolerance = 80)
- Edge detection with the convolution kernel
   0 -1  0
  -1  4 -1
   0 -1  0
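Both examples can be sketched in plain Python. The region-growing variant below compares each candidate pixel with the seed's grey-level and uses 4-adjacency; the slides do not specify either choice, so both are assumptions. The edge detector applies the 3 x 3 kernel shown above to interior pixels and leaves the border at 0.

```python
from collections import deque

KERNEL = [[0, -1, 0],
          [-1, 4, -1],
          [0, -1, 0]]


def region_grow(img, seed, tolerance):
    """Grow a region from seed = (row, col): add 4-neighbours whose
    grey-level differs from the seed's by at most `tolerance`."""
    rows, cols = len(img), len(img[0])
    reference = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(img[nr][nc] - reference) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def convolve3x3(img, kernel=KERNEL):
    """Apply a 3 x 3 kernel to every interior pixel. The kernel used here is
    symmetric, so convolution and correlation coincide."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(kernel[i][j] * img[r + i - 1][c + j - 1]
                            for i in range(3) for j in range(3))
    return out


img = [[10, 10, 200],
       [10, 20, 200],
       [10, 10, 200]]
print(sorted(region_grow(img, (0, 0), tolerance=80)))
print(convolve3x3(img)[1][1])
```

The kernel responds with 0 on flat areas and with large magnitudes where the grey-level changes abruptly, which is what makes it usable for edge detection.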
Segment feature extraction from images
Various approaches, e.g.
- area of the segment
- eccentricity/circularity
- shape approximation
- curvature
Desirable properties of segment features:
- Invariance to translation
- Invariance to rotation
- Invariance to scaling
Representing the shape of segments
See: Sven Loncaric: "A Survey of Shape Analysis Techniques", Pattern Recognition, 31 (8), pp. 983-1001, 1998.
Example: boundary scalar transform:
- Another possibility: tangent angles at regular intervals
- Both can be made rotation and scaling invariant.
- The resulting 1D-function is usually Fourier-transformed: amplitude (magnitude) values are rotation invariant; phase determines orientation and starting point.
- The lower-frequency Fourier coefficients can be used as the feature vector representing the shape. About 20 is often enough.
- Technical problems: non-convex shapes; shapes with holes.
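A minimal sketch of turning a sampled 1D boundary function into a feature vector. The input is assumed to be a list of samples taken at regular intervals around the boundary (for instance distances from the centroid). Dividing by the zero-frequency magnitude to obtain scale invariance is a common convention and an assumption here, as are the names.

```python
import cmath
import math


def fourier_descriptor(signature, k=20):
    """Magnitudes of the k lowest non-zero-frequency DFT coefficients of
    `signature`, divided by the zero-frequency magnitude.
    Returns fewer than k values if the signature is short."""
    n = len(signature)
    coefficients = [
        sum(s * cmath.exp(-2j * math.pi * u * t / n)
            for t, s in enumerate(signature))
        for u in range(min(k + 1, n))
    ]
    dc = abs(coefficients[0])   # assumes a signature with non-zero sum
    return [abs(c) / dc for c in coefficients[1:]]


signature = [5, 3, 4, 6, 8, 7, 5, 4]
shifted = signature[3:] + signature[:3]     # different starting point
scaled = [2 * s for s in signature]         # same shape, twice the size
print(fourier_descriptor(signature))
print(fourier_descriptor(shifted))
print(fourier_descriptor(scaled))
```

Changing the starting point shifts the samples cyclically, which alters only the phase of the coefficients, so the three printed vectors agree up to rounding.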
Representing the shape of segments (cont.)
Syntactic techniques, e.g. encoding of the boundary into symbols representing quantized directions: forward, left, right, forward-left, forward-right, ...
Example (f = forward, l = left, r = right, fr = forward-right), read from the start point of the boundary:
(f,f,f,fr,f,fr,r,f,fr,l,fr,fr,f,r,f)
String matching techniques (e.g. longest common subsequence) can be used to measure the distance between shapes.
Global scalar transform techniques, e.g. moments:
$$m_{pq} = \int\!\!\int x^p \, y^q \, f(x, y) \, dx \, dy, \quad \text{where } p, q = 0, 1, \ldots$$
The (infinite) set of moments contains all the information about the shape; in practice, we take a limited number (say 20) of lower-order moments. This is a straightforward technique for feature vector construction for shapes.
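Both techniques on this slide can be sketched briefly. For a digital image the double integral in the moment definition becomes a sum over pixels, with f(x, y) the pixel value; boundary strings are compared here by the length of their longest common subsequence. The function names and the example strings are illustrative.

```python
def moment(img, p, q):
    """Discrete moment m_pq = sum over pixels of x^p * y^q * f(x, y),
    where img is a list of rows and f(x, y) = img[y][x]."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(img)
               for x, value in enumerate(row))


def lcs_length(a, b):
    """Length of the longest common subsequence of two symbol sequences."""
    previous = [0] * (len(b) + 1)
    for symbol_a in a:
        current = [0]
        for j, symbol_b in enumerate(b, 1):
            if symbol_a == symbol_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


segment = [[0, 1],
           [1, 1]]
m00 = moment(segment, 0, 0)                  # area of the segment
centroid = (moment(segment, 1, 0) / m00,     # mean x
            moment(segment, 0, 1) / m00)     # mean y
print(m00, centroid)

print(lcs_length(["f", "f", "fr", "r"], ["f", "fr", "f", "r"]))
```

A longer common subsequence means more similar boundaries; one simple way to turn it into a distance is to subtract it from the length of the longer string.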
Texture feature extraction: some approaches
(a) Pixel neighbourhood features
Measure directly the visual patterns occurring in the texture.
- Color co-occurrence matrix
Probability of co-occurring colors at a given distance & direction.
- Local binary patterns
Classifies pixel neighbourhood distributions at several distances
(b) Transform-based features
Produce coefficients representing weights of spatial frequencies
- Fourier transform
Texture frequencies induce large related coefficients
- Wavelet transform
Division of the image signal into ‘subbands’ by a low-/high-pass filter bank. High-pass coefficients tend to be close to zero. Various approaches and filters have been suggested.
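As an illustration of the pixel neighbourhood features in (a), a co-occurrence matrix for one distance and direction can be computed as below. It is shown for grey-levels and stored sparsely as a dictionary; the offset parameters `dx` and `dy` are illustrative names.

```python
from collections import Counter


def cooccurrence(img, dx=1, dy=0):
    """Probability of each value pair (a, b) where b lies at offset
    (dx, dy) from a. `img` is a list of rows; the result maps (a, b)
    to its relative frequency."""
    rows, cols = len(img), len(img[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(img[r][c], img[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


img = [[0, 0, 1],
       [0, 1, 1]]
print(cooccurrence(img, dx=1, dy=0))   # horizontal neighbours at distance 1
```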
Content-based retrieval from image databases
General property: Retrieval is not 100 % precise.
Query types:
- Find images having certain features, e.g. color, texture, shape, etc.
- Find images containing certain types of objects.
- Find image objects having certain attributes, such as shape (circle, triangle, arc, ...), size, color, etc.
- Find images where an object of type 1 is located left of an object of type 2 (= spatial relationship)
- Similarity search: Find images (segments) similar to a given query image (segment). Applications: recognition of persons from photographs / fingerprints, recognition of military airplanes, ships, etc.
Approaches to similarity-based retrieval
(a) Direct metric approach
A distance function is defined for images (segments)
Task: Find the nearest neighbor (k nearest neighbors) of the query image (segment).
Naive distance functions for m × n color images:
- L1-metric: sum of pairwise Euclidean distances of RGB pixels
- L2-metric: Euclidean distance in (m × n × 3)-dimensional space.
See e.g. http://www.tineye.com
Plenty of computation needed.
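The two naive distance functions and a linear-scan nearest-neighbour search can be sketched as follows. Images are assumed to be equally sized lists of rows of (r, g, b) tuples; the function names are illustrative. `math.dist` requires Python 3.8 or later.

```python
import math


def l1_distance(a, b):
    """Sum over all pixel positions of the Euclidean distance between the
    two RGB triples (the 'L1-metric' as described above)."""
    return sum(math.dist(p, q)
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))


def l2_distance(a, b):
    """Euclidean distance between the images viewed as points in
    (m * n * 3)-dimensional space."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for p, q in zip(row_a, row_b)
                         for x, y in zip(p, q)))


def nearest(query, images, distance):
    """Linear scan: index of the image closest to the query."""
    return min(range(len(images)), key=lambda i: distance(query, images[i]))


a = [[(0, 0, 0), (10, 10, 10)]]
b = [[(3, 4, 0), (10, 10, 22)]]
print(l1_distance(a, b), l2_distance(a, b))
print(nearest(a, [b, a], l2_distance))
```

Every query touches every pixel of every stored image, which is the "plenty of computation" noted above.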
Approaches to similarity-based retrieval (cont.)
(b) Feature-based metrics
Use feature extraction to reduce dimensionality, e.g.
- Shapes of segments in the image
- Color, texture, … features of segments or the whole images
- Different types of features are often combined.
The ‘true’ distance function d of images/segments and the distance function d’ of extracted feature vectors should satisfy approximately
d(a, b) < d(a, c) ⇒ d’(a, b) < d’(a, c)
Use indexing to accelerate similarity retrieval:
- Multidimensional indexing for feature groups
- Inverted indexing for distinct features
- Mixture of these
Approaches to similarity-based retrieval (cont.)
(c) Transformation approach:
- Subsumes the metric approach.
- Basic idea: dissimilarity is proportional to the minimum cost of transforming one image (segment) to the other. Choose the image which is the least dissimilar to the query image.
- Examples of transformation operators: translation, rotation, scaling (reduction / magnification), extension, painting, etc.
- Each operator has an associated cost function. The total cost of a transformation is the sum of the elementary costs.
- Choose the minimum-cost chain of transformation steps for an image, then the minimum over all images in the target set.
- More flexible than the metric approach; ‘users’ can specify their own transformation operators and cost functions.
- The metric approach supports indexing better.
Example system: QBIC (Query By Image Content)
See: http://www.research.ibm.com/topics/popups/deep/manage/html/qbic.html
- Content-based finding of pictorial info from image & video databases
- Feature extraction in database loading:
  - Positional color/texture
  - Object identification: manual/semiautomatic/automatic segmentation
- Graphically expressed queries based on:
  - example images
  - user-constructed sketches and drawings (shape parameters)
  - color (principal color or color histogram)
  - texture patterns (coarseness, contrast, directionality)
  - camera and object motion (in videos)
- Distance functions between query and image features
- Fast searching:
  - Filtering and indexing
  - Reducing the dimensionality by transforms
Image database structures
The storage of the pixel matrix (possibly compressed) is usually sequential, because it spans several disk blocks.
Each image can be considered as a file of its own.
(a) Relational representation
- Image relation: image id and image-level (global) properties.
- Object relation: objects (segments, rectangles) within images; extracted manually or automatically. Attributes include: image id, object id, MBR (minimum bounding rectangle) coordinates, features.
- Generalization: probabilistic relations; object x is in image i with probability p.
- Queries: apply ‘normal’ database techniques using feature values in query conditions.
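The relational representation can be tried out with Python's built-in `sqlite3` module. The table and column names, the sample rows, and the choice of `shape` and `mean_grey` as features are all hypothetical; the `prob` column stands for the probabilistic generalization.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image (
    iid INTEGER PRIMARY KEY,          -- image id
    title TEXT, width INTEGER, height INTEGER   -- global properties
);
CREATE TABLE object (
    oid INTEGER PRIMARY KEY,          -- object id
    iid INTEGER REFERENCES image(iid),
    x_min INTEGER, y_min INTEGER, x_max INTEGER, y_max INTEGER,  -- MBR
    shape TEXT, mean_grey REAL,       -- extracted features (hypothetical)
    prob REAL DEFAULT 1.0             -- probability that the object is present
);
""")
conn.executemany("INSERT INTO image VALUES (?, ?, ?, ?)", [
    (1, "harbour", 640, 480),
    (2, "airfield", 800, 600),
])
conn.executemany("INSERT INTO object VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", [
    (10, 1, 20, 30, 120, 90, "ship", 85.0, 0.9),
    (11, 1, 300, 200, 340, 260, "circle", 200.0, 1.0),
    (12, 2, 50, 60, 200, 140, "airplane", 120.0, 0.6),
])

# Ordinary SQL with feature values in the query condition:
# images that contain a 'ship' object with probability at least 0.8.
rows = conn.execute("""
    SELECT DISTINCT i.iid, i.title
    FROM image i JOIN object o ON o.iid = i.iid
    WHERE o.shape = ? AND o.prob >= ?
    ORDER BY i.iid
""", ("ship", 0.8)).fetchall()
print(rows)
```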
Image database structures (cont.)
(b) Spatial representation
- E.g. using R-trees, R*-trees, etc.
- Build a single R-tree for all images in the database.
- A leaf page contains a set of closely-located objects (their MBRs), with a list of pointers to source images.
- Each list element contains the additional properties of the object.
- Separate indexes can be built for other than spatial properties.
General observations:
- In non-spatial respects, images can usually be treated as documents, and retrieved using techniques developed for general information retrieval.
- Combined usage of spatial and non-spatial criteria in retrieval is achieved simply by combining (union, intersection, etc.) the pointer lists from the related indexes.
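Combining pointer lists amounts to set operations on image ids. In the sketch below the two input lists stand for the results of a spatial index lookup and a keyword index lookup; the ids and the function name are made up for illustration.

```python
def combine(pointer_lists, mode="and"):
    """Combine lists of image ids returned by different indexes.
    mode="and" keeps ids present in every list (intersection),
    mode="or" keeps ids present in any list (union)."""
    sets = [set(pointers) for pointers in pointer_lists]
    if not sets:
        return []
    if mode == "and":
        result = set.intersection(*sets)
    else:
        result = set.union(*sets)
    return sorted(result)


spatial_hits = [3, 5, 8, 13]     # e.g. objects inside a query rectangle
keyword_hits = [5, 13, 21]       # e.g. images indexed with a given term
print(combine([spatial_hits, keyword_hits], mode="and"))
print(combine([spatial_hits, keyword_hits], mode="or"))
```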
Scenario for an image database architecture
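[Figure: architecture diagram with the following components]
- Indexes for global properties of images
- Indexes for object features in images
- Spatial index for objects in images
- Table with columns: Oid, Iid, feat1 ... featn, Coord
- Table with columns: Iid, prop1 ... propm, Ptr
- Images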