Date posted: 02-Dec-2014
A BRIEF INTRODUCTION TO EXTRACTING INFORMATION FROM IMAGES
Jonathon Hare
University of Southampton
CONTENTS
• What can images tell us?
• How are images represented in digital computers?
• How do we extract information from images?
  – Examples of some different extraction techniques
  – Analogies with text
  – Free software!
IMAGES CAN…
the main roles of images in the communications process
ATTRACT ATTENTION AND MAKE DOCUMENTS MORE APPEALING
CONVEY OPINIONS AND EMOTIONAL MESSAGES
CONVEY INFORMATION FOR DOCUMENTING A CLAIM
REPRESENTATION AND UNDERSTANDING
how a computer “sees”
DIGITAL IMAGE REPRESENTATION
87  91  85  ...
86  86  81  ...
88  85  84  ...
... ... ... ...

137 145 144 ...
153 150 137 ...
148 139 123 ...
... ... ... ...

89  91  89  ...
84  88  90  ...
88  87  90  ...
... ... ... ...
UNDERSTANDING AN IMAGE
FEATURE EXTRACTION
f(x)
Feature extraction is the process of extracting “descriptors” from an image. Descriptors describe some aspect of the image content. Typically, a descriptor is a numerical vector called a “feature vector”, although other forms of descriptor are possible.
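As a minimal sketch of the f(x) idea, the function below maps a small greyscale image to a two-element feature vector. The choice of descriptors (mean brightness and contrast) and the name `extract_features` are illustrative assumptions, not something taken from the slides; the pixel values are the first grid shown on the image-representation slide.

```python
from statistics import mean, pstdev

def extract_features(image):
    """Map a greyscale image (list of rows of 0-255 ints) to a feature vector."""
    pixels = [p for row in image for p in row]
    # Two illustrative descriptors: average brightness and contrast
    # (population standard deviation of the intensities).
    return [mean(pixels), pstdev(pixels)]

image = [
    [87, 91, 85],
    [86, 86, 81],
    [88, 85, 84],
]
print(extract_features(image))
```

Any function that turns pixels into a fixed or variable-length description plays the same role; the later slides describe more useful choices.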
IMAGE FEATURE MORPHOLOGY
• Higher-level features
  – Directly interpretable by humans
    • e.g. the number of faces in the image
  – Either hand-crafted or trained with machine learning techniques
• Lower-level features
  – Much more abstract; convey a notion of the image content
    • e.g. the colour distribution of the image
EXAMPLE HIGH-LEVEL FEATURES
faces, composition & photoshop disasters
HIGH-LEVEL FEATURES: FACE DETECTION
• The detection of faces in an image is a very useful feature for inferring information about an image
  – Face detection is the first step of face recognition
• The most popular face detection algorithm is the “Viola-Jones” detector
  – Conceptually simple
  – Uses machine learning; requires training (slow)
  – Very fast detection
VIOLA-JONES FACE DETECTION
P. Viola, M. Jones, Robust Real-Time Face Detection, IJCV, Vol. 57(2), 2004. (first version appeared at CVPR 2001)
Bank of filters. Consider all possible position, scale and type parameters (a very large number of features)
For each feature create a simple (weak) binary classifier (a stump)
Use AdaBoost to select the informative features
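The sketch below illustrates one filter from such a bank and the stump built on it. It uses the integral-image trick from the cited Viola-Jones paper so that any rectangle sum costs four lookups. The toy patch, the threshold and the function names are assumptions for illustration, and the AdaBoost selection step is not shown.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Haar-like feature: upper w x h rectangle minus the one directly below it."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x, y + h, w, h)

def stump(feature_value, threshold, polarity=1):
    """Weak classifier: 1 if polarity * value < polarity * threshold, else 0."""
    return 1 if polarity * feature_value < polarity * threshold else 0

# Toy patch: dark band above a bright band (hypothetical data).
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(patch)
value = two_rect_feature(ii, 0, 0, 4, 2)
print(value, stump(value, threshold=-500))
```

In a trained detector the threshold and polarity of each stump are learnt from labelled face and non-face patches rather than chosen by hand as here.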
HIGH-LEVEL FEATURES: COMPOSITION
• Photographers use the “rule-of-thirds” to improve the composition of their photos.
  – The basic idea is to place main subjects at roughly one-third of the horizontal or vertical dimension of the photograph.
It is possible to design features that detect whether an image is composed according to the rule-of-thirds
Figure: image; saliency map; segments + saliency map. Features: distance to closest power-point; area of segment × saliency of segment
Che-Hua Yeh, Yuan-Chen Ho, Brian A. Barsky, and Ming Ouhyoung. "Personalized Photograph Ranking and Selection System". In ACM Multimedia 2010, pages 211–220, October 2010.
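A rough sketch of a rule-of-thirds feature built from the quantities named on the slide (distance to the closest power-point, weighted by segment area × saliency). The exact formula in Yeh et al. is not given in the slides, so the normalisation and the weighted average below are assumptions, as are the function names and the `(cx, cy, area, saliency)` segment format.

```python
import math

def power_points(width, height):
    """The four intersections of the one-third and two-thirds lines."""
    return [(width * i / 3, height * j / 3) for i in (1, 2) for j in (1, 2)]

def thirds_distance(cx, cy, width, height):
    """Distance from (cx, cy) to the closest power-point, divided by the diagonal."""
    diagonal = math.hypot(width, height)
    closest = min(math.hypot(cx - px, cy - py)
                  for px, py in power_points(width, height))
    return closest / diagonal

def composition_score(segments, width, height):
    """Weighted mean closeness to a power-point; weight = area * saliency.

    Each segment is a (cx, cy, area, saliency) tuple (hypothetical format).
    """
    total_weight = sum(area * sal for _, _, area, sal in segments)
    if total_weight == 0:
        return 0.0
    weighted = sum(area * sal * (1 - thirds_distance(cx, cy, width, height))
                   for cx, cy, area, sal in segments)
    return weighted / total_weight

on_third = composition_score([(100, 100, 50, 0.9)], 300, 300)
centred = composition_score([(150, 150, 50, 0.9)], 300, 300)
print(on_third, centred)
```

A subject sitting exactly on a power-point scores highest, and the score falls as the salient segments drift away from all four points.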
HIGH-LEVEL FEATURES: TAMPERING
A political advertisement for George W. Bush
Automatic cloning detection (“copy-move” forgery)
EXAMPLE LOW-LEVEL FEATURES
colour histograms, segments and sift
LOW-LEVEL FEATURES: GLOBAL
• Global features describe the content of an entire image
  – One of the simplest global features is the “Global RGB Colour Histogram”
• Quantise each pixel into a discrete number of colours and then build a histogram.
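The quantise-then-count step can be sketched as follows. The number of bins per channel (4, giving 64 colours) and the normalisation to proportions are illustrative assumptions; the slides do not fix either.

```python
def rgb_histogram(pixels, bins_per_channel=4):
    """Normalised joint RGB histogram of (r, g, b) tuples with values 0-255."""
    hist = [0] * bins_per_channel ** 3
    n = 0
    for r, g, b in pixels:
        # Quantise each channel to an integer in [0, bins_per_channel).
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
        n += 1
    # Divide by the pixel count so images of different sizes are comparable.
    return [count / n for count in hist] if n else hist

pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]  # two reds and one blue
hist = rgb_histogram(pixels)
print(len(hist), hist[48], hist[3])
```

The resulting vector has a fixed length regardless of image size, which is what makes it a convenient global feature.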
LOW-LEVEL FEATURES: LOCAL
• Global features are useful for some tasks, but in many cases are not powerful enough
• Local features attempt to overcome this by breaking the image into smaller parts from which to extract features
  – Three primary techniques for splitting up the image:
    • segmentation
    • salient regions & interest points
    • grids & blocks
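Of the three splitting techniques, grids and blocks are the simplest to sketch. The example below cuts a greyscale image into a regular grid and computes one feature per block. The per-block feature (mean intensity) and the function names are assumptions for illustration, and the sketch assumes the image has at least as many rows and columns as the grid.

```python
def grid_blocks(image, rows, cols):
    """Split an image (list of pixel rows) into rows x cols blocks, row-major."""
    h, w = len(image), len(image[0])
    blocks = []
    for i in range(rows):
        y0, y1 = i * h // rows, (i + 1) * h // rows
        for j in range(cols):
            x0, x1 = j * w // cols, (j + 1) * w // cols
            blocks.append([row[x0:x1] for row in image[y0:y1]])
    return blocks

def block_means(image, rows, cols):
    """One local feature per block: the mean pixel intensity."""
    means = []
    for block in grid_blocks(image, rows, cols):
        pixels = [p for row in block for p in row]
        means.append(sum(pixels) / len(pixels))
    return means

image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [5, 5, 1, 1],
    [5, 5, 1, 1],
]
print(block_means(image, 2, 2))
```

Unlike a global feature, the result keeps some information about where in the image each value came from.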
SALIENT INTEREST REGIONS
• Salient interest regions and their associated features are currently the most popular way of describing image content.
• Extracting image features using interest regions is a two-part process:
  – Find regions
  – Extract features to describe region properties
• Typically, the resultant image feature will have a variable length, dependent on the number of regions
SALIENT INTEREST REGION LOCATION
• Important regions exhibit:
  – Repeatability
  – Saliency
• Corners and blobs have these qualities
• Detectable using various techniques
  – Difference of Gaussian - corners
  – Harris corner detector - corners
  – MSER - blobs
Figure panels: corners; blobs
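As one concrete example of a region detector, the sketch below computes a simplified Harris corner response. It is a minimal version under stated assumptions: central-difference gradients, a uniform 3×3 window instead of the usual Gaussian weighting, the commonly used constant k = 0.04, and no non-maximum suppression. The function name and toy image are hypothetical.

```python
def harris_response(img, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 for each interior pixel."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Central-difference gradients; border pixels are left at zero.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Structure tensor M summed over a 3x3 neighbourhood.
            sxx = sxy = syy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp

# Toy image: a bright square occupying the bottom-right quadrant.
img = [[1 if x >= 4 and y >= 4 else 0 for x in range(8)] for y in range(8)]
resp = harris_response(img)
print(resp[4][4], resp[6][4], resp[1][1])
```

The response is positive at the corner of the square, negative along its straight edge and zero in flat areas, which is the property that makes corners easy to pick out by thresholding.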
SALIENT INTEREST REGION DESCRIPTORS
• Good region descriptors exhibit:
  – Resilience to image transforms
  – Compactness
• Emphasise different image characteristics:
  – Pixel intensities, colour, texture, edges etc.
• Common descriptors include:
  – SIFT: histogram of edge orientation
  – Shape context: histogram of edge location
SIFT: SCALE INVARIANT FEATURE TRANSFORM
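The "histogram of edge orientation" idea can be sketched with a single magnitude-weighted histogram over a patch. This is a simplification and not the full SIFT descriptor, which concatenates such histograms over a 4×4 spatial grid and adds Gaussian weighting, interpolation and rotation normalisation. The function name, bin count and toy patches are assumptions.

```python
import math

def orientation_histogram(patch, bins=8):
    """Histogram of gradient orientations over a patch, weighted by magnitude."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat pixels carry no orientation
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    total = sum(hist)
    # Normalise so the descriptor does not depend on overall contrast.
    return [v / total for v in hist] if total else hist

# Toy patches: brightness increasing left-to-right, then top-to-bottom.
horizontal_ramp = [[x * 10 for x in range(5)] for _ in range(5)]
vertical_ramp = [[y * 10 for _ in range(5)] for y in range(5)]
print(orientation_histogram(horizontal_ramp))
print(orientation_histogram(vertical_ramp))
```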
ANALOGIES WITH TEXT
introducing the visual bag-of-words
BAGS OF VISUAL WORDS
In recent years it has become popular in the computer vision community to model the content of an image in a similar way to a “bag-of-terms” in textual document analysis.
BOVW USING LOCAL FEATURES
• Features are localised by a robust region detector and described by a local descriptor such as SIFT.
• A vocabulary of exemplar feature-vectors is learnt.
  – Traditionally through k-means clustering.
• Local descriptors can then be quantised to discrete visual terms by finding the closest exemplar in the vocabulary.
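The pipeline above can be sketched end to end on toy data: learn a vocabulary with k-means, quantise each descriptor to its closest exemplar, and count the resulting visual terms. The 2-D "descriptors" stand in for real 128-dimensional SIFT vectors, and seeding the centroids with the first k descriptors is an assumption made to keep the example deterministic.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vocab, descriptor):
    """Index of the closest exemplar, i.e. the descriptor's visual term."""
    return min(range(len(vocab)), key=lambda i: sq_dist(vocab[i], descriptor))

def kmeans(descriptors, k, iterations=10):
    """Plain Lloyd's k-means; the first k descriptors seed the centroids."""
    centroids = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centroids, d)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster empties
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def bovw_histogram(vocab, descriptors):
    """Count how often each visual term occurs among one image's descriptors."""
    counts = Counter(nearest(vocab, d) for d in descriptors)
    return [counts[i] for i in range(len(vocab))]

training = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
vocab = kmeans(training, k=2)
print(bovw_histogram(vocab, [(0, 0), (1, 1), (9, 9)]))
```

Once every image is reduced to term counts like this, standard text-retrieval machinery can be applied to images.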
APPLICATIONS OF BOVW
• BOVW models have many applications
  – Auto-annotation and object recognition
  – Concept classification
  – Large-scale indexing
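For the large-scale indexing application, visual terms can be stored in an inverted index exactly as words are in text retrieval. The sketch below is a toy in-memory version. The scoring rule (summing the smaller of the query and image counts per shared term) and all names are illustrative assumptions, not the API or ranking of any particular system.

```python
from collections import Counter, defaultdict

def build_index(images):
    """Map each visual term to {image_id: occurrence count}.

    images: {image_id: list of visual-term ids} (hypothetical input format).
    """
    index = defaultdict(dict)
    for image_id, terms in images.items():
        for term, count in Counter(terms).items():
            index[term][image_id] = count
    return index

def search(index, query_terms):
    """Rank images by term overlap with the query (histogram intersection)."""
    scores = defaultdict(int)
    for term, q_count in Counter(query_terms).items():
        # Only postings for terms in the query are touched.
        for image_id, count in index.get(term, {}).items():
            scores[image_id] += min(q_count, count)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

images = {"a": [1, 1, 2], "b": [2, 3], "c": [4]}
index = build_index(images)
print(search(index, [1, 2]))
```

Because a query only visits the postings of its own terms, images sharing no visual terms with it are never examined, which is what lets the approach scale.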
OPEN-SOURCE TOOLS FOR IMAGE ANALYSIS AND INDEXING
introducing openimaj & imageterrier
• Open-source (BSD Licence) libraries and tools for multimedia (image, video, sound) analysis and information extraction
• Implemented in Java; use with any JVM language
  – Implementations of all the techniques mentioned in this tutorial
  – Scalability of extraction using Hadoop with the included tools
http://www.openimaj.org
• Extension to the Terrier retrieval system to allow indexing of images
  – Collections and documents that read data produced from image feature extractors.
  – New indexers and supporting classes to make compressed augmented inverted indices for visual term data.
  – New distance measures implemented as WeightingModels.
  – Geometric re-ranking implemented as DocumentScoreModifiers.
  – Command-line tools for indexing and searching.
• Freely available under the Mozilla Licence
http://www.imageterrier.org