Indoor Scene Classification
Anuja Ranjan, Manav Garg
Prof: Dr. Amitabha Mukherjee
Problem
• Classifying indoor scenes is challenging because of the large variation among examples within each class and the similarities between different classes.
• Besides spatial properties, indoor scene classification requires recognizing the objects the scenes contain to achieve good accuracy.
Related Work
In recent work, Quattoni and Torralba extract regions of interest from images and compare them. They do not use objects in their approach but note that some indoor scenes are better classified by the objects they contain.
Img source: Ariadna Quattoni and Antonio Torralba: "Recognizing Indoor Scenes," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
Towards classification via objects
• Training Phase: Extracting features and building object classifiers
Extracted HOG features and Gabor features for images of each of the 4 object classes. Images were obtained from the Caltech dataset and Google Images.
Used the AdaBoost algorithm to combine these weak classifiers into a strong classifier.
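The boosting step in the training phase can be sketched as follows. This is a minimal illustration of AdaBoost over threshold stumps, not the classifier used in this work: the two-dimensional toy vectors stand in for HOG/Gabor feature vectors, and the function names (train_adaboost, predict) and the data are assumptions made for the example.

```python
import math

def train_adaboost(X, y, rounds=5):
    """Train AdaBoost with threshold stumps.

    X: list of feature vectors; y: labels in {-1, +1}.
    Returns a list of (alpha, feature_index, threshold, polarity).
    """
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    preds = [pol if x[f] >= thr else -pol for x in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, preds)
        err, f, thr, pol, preds = best
        # Clamp the error so the log below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Increase the weight of misclassified samples, then renormalise.
        weights = [w * math.exp(-alpha * t * p) for w, p, t in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: +1 = object present, -1 = object absent.
X = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.3], [0.9, 0.1]]
y = [-1, -1, 1, 1]
model = train_adaboost(X, y)
```

Each stump is a weak classifier on a single feature dimension; the weighted vote in predict is the strong classifier.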
• Testing Phase: Detecting objects in the test image and predicting the scene probability
Used a sliding-window method with 5 window shapes to detect objects in the scene. The method is further improved by using 3D features of the image.
With the detections, the confidence files, and the prior probabilities of objects in the scene, we classify the image into scenes.
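One way to combine detection confidences with object priors is a naive-Bayes-style product, sketched below. The combination rule, the function name scene_posterior, and every number in the tables are assumptions for illustration; the slides do not specify the exact model.

```python
def scene_posterior(detections, object_given_scene, scene_prior):
    """Return normalised scene probabilities.

    detections: {object: detector confidence in [0, 1]}
    object_given_scene: {scene: {object: P(object present | scene)}}
    scene_prior: {scene: P(scene)}
    """
    scores = {}
    for scene, prior in scene_prior.items():
        score = prior
        for obj, conf in detections.items():
            p = object_given_scene[scene][obj]
            # Soft evidence: the object is present with probability `conf`.
            score *= conf * p + (1.0 - conf) * (1.0 - p)
        scores[scene] = score
    total = sum(scores.values())
    return {scene: value / total for scene, value in scores.items()}

# Hypothetical priors and detector confidences.
object_given_scene = {
    "Office": {"Monitor": 0.8, "Screen": 0.2},
    "Hall": {"Monitor": 0.1, "Screen": 0.1},
    "Conference": {"Monitor": 0.3, "Screen": 0.8},
}
scene_prior = {"Office": 1 / 3, "Hall": 1 / 3, "Conference": 1 / 3}
detections = {"Monitor": 0.8, "Screen": 0.05}

posterior = scene_posterior(detections, object_given_scene, scene_prior)
best_scene = max(posterior, key=posterior.get)
```

With a confident monitor detection and almost no screen evidence, the hypothetical tables favour the office scene.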
Results
Visual Prob: 0.786, 3D Prob: 0.915
Office: 68.26, Hall: 14.31, Conference: 17.43
Office: 37.33, Hall: 30.33, Conference: 32.34
Office: 18.53, Hall: 22.84, Conference: 58.61
Ref: P. Espinace, T. Kollar, A. Soto, and N. Roy: "Indoor scene recognition through object detection," IEEE International Conference on Robotics and Automation (ICRA), 2010.
Improving the work
• Use Gist features and an SVM to develop object classifiers.
Why Gist? What's new?
• Gist features build a low-dimensional representation of a scene that relies on perceptual features and does not require segmentation or object recognition.
• Results obtained with Gist classification show a considerable improvement in context-based recognition and classification.
• They have not yet been used as such in this classification problem.
Source: Oliva, A., & Torralba, A. (2001) "Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope," International Journal of Computer Vision, 42, 145-175.
Comparison of Object Classifiers
• Average overall accuracy achieved by the HOG classifier: ~62%
• Our classifier: ~75%
Monitor: 0.786
Monitor: 0.15, Screen: 0.028
Monitor: 0.65
Monitor: 0.052, Screen: 0.621
Gist features:
HOG features:
Appendix
HOG features
• HOG features record the occurrence of gradient orientations in localized areas of an image.
• They use the fact that an object can be described by the distribution of its intensity gradients.
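The idea of describing an object by its gradient distribution can be sketched for a single image cell. This is a simplified illustration: a full HOG descriptor also interpolates votes between bins and normalises histograms over blocks of cells, both omitted here. The function name cell_histogram and the 4x4 toy cell are assumptions.

```python
import math

def cell_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram (0 to 180 degrees) for one grayscale cell.

    cell: list of rows of intensities. Border pixels are skipped so that
    central differences are always defined.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# Toy cell containing a vertical edge (dark left half, bright right half).
cell = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
hist = cell_histogram(cell)
```

For the vertical edge, all gradient energy falls into the first bin (horizontal gradient direction), which is what lets the histogram encode edge orientation.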
Gabor features
• Gabor filters are used for edge detection, texture representation, etc.
• Filters with different frequencies and orientations are used for feature extraction.
• A 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave.
• They are also used for sparse object representation.
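A small filter bank with several orientations can be sketched as below. The real-valued kernel formula is the standard one (Gaussian envelope times cosine carrier); the function name gabor_kernel and all parameter values are assumptions chosen for illustration, not the settings used in this work.

```python
import math

def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor kernel as a size x size list of rows (size should be odd).

    sigma: width of the Gaussian envelope; theta: orientation in radians;
    wavelength: period of the sinusoidal carrier; gamma: spatial aspect ratio;
    psi: phase offset.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Bank of 4 orientations (0, 45, 90, 135 degrees) at one hypothetical scale.
bank = [gabor_kernel(5, 2.0, k * math.pi / 4, 4.0) for k in range(4)]
```

Convolving an image with each kernel in the bank and collecting the responses gives one feature per frequency and orientation.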
SR Ranger Method
• The dataset consists of pairs of visual and 3D images. The depth in the 3D image is estimated per pixel by measuring the time difference between the signal sent and the signal received.
• 3D models are used to compute the probability that the geometric properties of a given object match the information present in a given window (size, height), which improves object classification.
• This takes the spatial properties into account.
• To detect an object, the probabilities from both the visual image and the 3D image must be above a threshold.
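The two ideas above, depth from a round-trip delay and the joint threshold test, can be sketched as follows. The threshold value of 0.5 and the function names are assumptions; the slides do not state the threshold that was used. The probabilities in the example are the Visual Prob and 3D Prob values from the Results slide.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def depth_from_round_trip(delay_seconds):
    """Distance to the surface: the signal travels there and back, hence the division by 2."""
    return SPEED_OF_LIGHT * delay_seconds / 2.0

def accept_detection(visual_prob, prob_3d, threshold=0.5):
    """Accept a window only if both the visual and the 3D probability exceed the threshold."""
    return visual_prob > threshold and prob_3d > threshold

depth = depth_from_round_trip(20e-9)        # a 20 ns delay is roughly 3 m
accepted = accept_detection(0.786, 0.915)   # values from the Results slide
rejected = accept_detection(0.786, 0.30)    # hypothetical weak 3D match
```

Requiring both probabilities to pass means a window that looks like the object but has the wrong size or height is discarded.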