Learning Models for Object Recognition from Natural Language Descriptions Presenters: Sagardeep...

Learning Models for Object Recognition from Natural Language Descriptions

Presenters:
Sagardeep Mahapatra - 108771077
Keerti Korrapati - 108694316

Goal

• Learning models for visual object recognition from natural language descriptions alone

Why learn models from natural language?

• Manually collecting and labeling large image sets is difficult

• New training set needs to be created for each new category

• Finding images for fine-grained object categories is tough

  • Example: species of plants and animals

  • But detailed visual descriptions may be readily available

Outline

• Datasets for training and testing

• Natural Language Processing methods

• Template Filling

• Extraction of visual attributes from test images

• Score an image against the learnt template models

• Results

• Observations

Dataset

• Text descriptions of ten butterfly species from the eNature guide are used to construct the template models

  • Butterflies were chosen because they have distinctive visual features such as wing colors and spots

• Images downloaded from Google for each of the ten butterfly categories form the testing set

• The ten species: Danaus plexippus, Heliconius charitonius, Heliconius erato, Junonia coenia, Lycaena phlaeas, Nymphalis antiopa, Papilio cresphontes, Pieris rapae, Vanessa atalanta, Vanessa cardui

Natural Language Processing

• Goal: Convert unstructured data in descriptions into structured templates

[Figure: factual but unstructured data in text is passed through Information Extraction to fill a template]

Template Filling

• Text is tokenized into words

• Tokens are tagged with parts of speech (using the C&C tagger)

• Custom transformations are performed to correct known mistakes

  • Required because the eNature guide tends to suppress some information

• Chunks of text matching predefined tag sequences are extracted

  • Example: noun phrases (‘wings have blue spots’), adjective phrases (‘wings are black’)

• Extracted phrases are filtered through a list of colors, patterns and positions to fill the template slots

Pipeline: Tokenization → Part-of-Speech Tagging → Custom Transformation → Chunking → Template Filling
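The pipeline above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `POS_LEXICON` is a hypothetical stand-in for the C&C tagger, the `COLORS` and `PATTERNS` lists and the slot names are invented for the example, and the custom transformation step is omitted.

```python
import re

# Hypothetical mini-lexicon standing in for a real POS tagger
# (the slides use the C&C tagger, which is not reproduced here).
POS_LEXICON = {
    "wings": "NNS", "spots": "NNS", "bands": "NNS",
    "have": "VBP", "are": "VBP",
    "black": "JJ", "blue": "JJ", "orange": "JJ", "white": "JJ",
}
# Hypothetical filter lists for the template slots.
COLORS = {"black", "blue", "orange", "white"}
PATTERNS = {"spots", "bands"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tag(tokens):
    """Tag each token from the toy lexicon; unknown words get 'UNK'."""
    return [(tok, POS_LEXICON.get(tok, "UNK")) for tok in tokens]


def chunk(tagged, pattern):
    """Return every run of tokens whose tags equal the given tag sequence."""
    n = len(pattern)
    return [
        [tok for tok, _ in tagged[i:i + n]]
        for i in range(len(tagged) - n + 1)
        if [t for _, t in tagged[i:i + n]] == list(pattern)
    ]


def fill_template(text):
    """Fill the 'dominant_color' and 'patterns' slots from one description."""
    tagged = tag(tokenize(text))
    template = {"dominant_color": None, "patterns": []}
    # Phrases like "wings are black": plural noun, verb, adjective.
    for noun, verb, adj in chunk(tagged, ("NNS", "VBP", "JJ")):
        if noun == "wings" and verb == "are" and adj in COLORS:
            template["dominant_color"] = adj
    # Phrases like "blue spots": adjective followed by a plural noun.
    for adj, noun in chunk(tagged, ("JJ", "NNS")):
        if adj in COLORS and noun in PATTERNS:
            template["patterns"].append((adj, noun))
    return template
```

Calling `fill_template("Wings are black. Wings have blue spots.")` fills the dominant color slot with `black` and records the pair `("blue", "spots")`. A real system would need a full tagger and far richer tag patterns than the two shown here.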

Visual Processing

Visual processing is based on two attributes of butterflies:

• Dominant Wing Color

• Colored Spots

1) Image Segmentation

• Variation in the background can pose challenges during image classification

• Hence, the butterfly image was segmented from the background using the ‘star shape’ graph cut approach

2) Spot Detection (Using a spot classifier)

• Hand-marked butterfly images with no prior class information form the training set for the spot classifier

• Candidate regions likely to be spots are extracted using the Difference-of-Gaussians interest point operator

• Image descriptors (SIFT features) are extracted around the candidate spot to classify it as a spot or non-spot
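The candidate extraction step can be illustrated with a minimal, pure-Python Difference-of-Gaussians sketch on a small grayscale grid. The parameter values (`sigma`, `ratio`, `threshold`) and the single-scale local-maximum test are assumptions made for this example; the slides do not give the settings used, and the SIFT-based spot/non-spot classification that follows is not shown.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel with radius ceil(3 * sigma)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur of a 2-D list, clamping indices at the border."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    # Horizontal pass, then vertical pass over the intermediate result.
    rows = [
        [sum(k * image[y][clamp(x + d - radius, w)] for d, k in enumerate(kernel))
         for x in range(w)]
        for y in range(h)
    ]
    return [
        [sum(k * rows[clamp(y + d - radius, h)][x] for d, k in enumerate(kernel))
         for x in range(w)]
        for y in range(h)
    ]


def dog_candidates(image, sigma=1.0, ratio=1.6, threshold=0.05):
    """Return (row, col) positions where |DoG| is a strict local maximum
    over the 8 neighbours and exceeds the threshold."""
    fine, coarse = blur(image, sigma), blur(image, sigma * ratio)
    h, w = len(image), len(image[0])
    dog = [[fine[y][x] - coarse[y][x] for x in range(w)] for y in range(h)]
    candidates = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            value = abs(dog[y][x])
            neighbours = [abs(dog[y + dy][x + dx])
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if value > threshold and all(value > n for n in neighbours):
                candidates.append((y, x))
    return candidates
```

On a 15x15 image that is zero everywhere except one bright pixel, the only candidate returned is that pixel's position. A practical detector would search over several scales and work on real image arrays rather than nested lists.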

3) Color Modelling

• Required to connect color names of dominant wing colors and spot colors in learnt templates to image observations

• For each color name c_i, a probability distribution p(z | c_i) was learnt from training butterfly images, where z is a pixel color observation in the L*a*b* color space
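As a rough sketch of what learning p(z | c_i) could look like, the code below fits one Gaussian per color name with independent L*, a*, b* channels. The Gaussian form, the variance floor, and the function names are assumptions for illustration; the slides do not state which density model was used.

```python
import math


def fit_color_model(samples_by_name):
    """Fit a per-channel (diagonal) Gaussian to the L*a*b* samples of each
    color name. samples_by_name maps a name to a list of (L, a, b) tuples."""
    models = {}
    for name, samples in samples_by_name.items():
        n = len(samples)
        means = [sum(s[k] for s in samples) / n for k in range(3)]
        # A small variance floor keeps the density finite for near-constant channels.
        variances = [
            max(sum((s[k] - means[k]) ** 2 for s in samples) / n, 1e-3)
            for k in range(3)
        ]
        models[name] = (means, variances)
    return models


def log_likelihood(z, model):
    """Return log p(z | c_i) for one L*a*b* pixel z under a fitted model."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (z[k] - mu) ** 2 / (2 * var)
        for k, (mu, var) in enumerate(zip(means, variances))
    )


def most_likely_color(z, models):
    """Return the color name whose model gives z the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(z, models[name]))
```

Given a handful of made-up training pixels per name, `most_likely_color` maps a new L*a*b* pixel to the closest-fitting color name. Working in log space avoids underflow when many pixel likelihoods are later multiplied together.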

Generative Model

Given an input image I, the probability of the image given a butterfly category B_i is written as a product over the spot and wing observations. The model uses two priors over color names:

• Spot color name prior: equal priors for all spot colors

• Dominant color name prior
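One possible reading of "a product over the spot and wing observations" is sketched below in log space. The exact form is an assumption, not the presenters' formula: each observation is taken to be a dictionary of per-color likelihoods p(z | c), each term mixes those likelihoods with a color name prior, and the template fields `spot_colors` and `dominant_prior` are hypothetical names.

```python
import math


def mixture_log_prob(likelihoods, prior):
    """Return log of sum_c p(z | c) * P(c) for one observation."""
    total = sum(likelihoods.get(color, 0.0) * weight
                for color, weight in prior.items())
    return math.log(total) if total > 0 else float("-inf")


def score_image(spot_obs, wing_obs, template):
    """Log-score of an image under one category template (assumed form):
    the log of a product over spot terms and wing terms."""
    spot_colors = template["spot_colors"]
    # Equal priors over the spot colors named in the template.
    spot_prior = {c: 1.0 / len(spot_colors) for c in spot_colors}
    score = sum(mixture_log_prob(obs, spot_prior) for obs in spot_obs)
    score += sum(mixture_log_prob(obs, template["dominant_prior"])
                 for obs in wing_obs)
    return score


def classify(spot_obs, wing_obs, templates):
    """Return the category whose template gives the highest log-score."""
    return max(templates,
               key=lambda name: score_image(spot_obs, wing_obs, templates[name]))
```

With two invented templates, an image whose spots look white and whose wings look orange scores higher under the template that names white spots and gives orange a high dominant color prior. Summing logs instead of multiplying probabilities keeps the score numerically stable for images with many observations.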

Experimental Results

Two sets of experiments were performed:

• Performance of human beings in recognizing butterflies from textual descriptions

  • This can reasonably be considered an upper bound

• Performance of the proposed method

Human Performance

Performance of proposed method

Observations

• Accuracy of proposed method was comparable to accuracy of non-native English speakers

• Accuracy of proposed method was more than 80 percent for four categories

• Classification of ‘Heliconius charitonius’ was the toughest for humans and also for the proposed method, with both the ground-truth and the learnt templates

• Performance with ground-truth templates was comparable to that with the learnt templates

  • Errors introduced into the templates by the NLP methods did not have much impact

Thank You

Date post:	12-Jan-2016
Upload:	elwin-walker
