Seminar Report On
Sixth Sense Technology
Gokaraju Rangaraju Institute of Engineering and Technology
Hyderabad
Department of Computer Science Engineering
By,
Contents
1 Introduction
1.1 Organization Of the Report
2 Gesture Recognition
2.1 Motion Segmentation
2.2 Skin Color Model
2.3 Geometric Analysis
2.4 Motion Trajectories
2.5 Recognizing Motion Patterns Using Time Delay Neural Network (TDNN)
3 Applications
4 Components in sixth sense technology
4.1 Kinds of gestures recognized
5 Working
6 Merits and Demerits
6.1 Advantages
6.2 Concerns
7 Related Technologies
7.1 Gesture Recognition
7.2 Augmented Reality
7.3 Computer Vision
8 Conclusion
Bibliography
Abstract
Information is everything in today's world. Yet the world in which that
information lives is very small. From details about the various stars and
galaxies to the description of the tiniest piece of junk that you can get in a
supermarket, everything is present on the internet. Yet it is confined, trapped
in a screen on a desktop or on a mobile. Sixthsense breaks these confines and
brings the information to the real world. Rather than adjusting ourselves to
the latest machine (gadget), Sixthsense adjusts the machine to us and trains it
to understand our natural hand gestures.
Sixth Sense is a wearable gestural interface that
augments the physical world around us with digital information and lets us
use natural hand gestures to interact with that information. Steve Mann, who
built a wearable computer in 1990, is considered the father of Sixth Sense
Technology. He implemented the Sixth Sense Technology as a neck-worn
projector with a camera system (which Mann originally referred to as
“Synthetic Synesthesia of the Sixth Sense”). He was a Media Lab student at
that time. His work was then carried forward by Pranav Mistry (a Ph.D. student
in the Fluid Interfaces Group at the MIT Media Lab). By using a camera and
a tiny projector, Sixth Sense sees what we see and visually augments any
surfaces or objects we are interacting with. It projects information onto
surfaces, walls and physical objects around us, and lets us interact with the
projected information through natural hand gestures, arm movements, or our
interaction with the object itself.
Chapter 1
Introduction
Steve Mann was the first person to bring forth the idea of Sixthsense, in the form of a
device called Telepointer; it was originally referred to as 'Synthetic Synesthesia of the
Sixth Sense'. He is also considered the father of Sixthsense. The idea was later developed
by Pranav Mistry, a PhD student at MIT.
'Sixthsense' is a wearable gestural interface that augments the physical world around
us with digital information and lets us use natural hand gestures to interact with that
information.
Hardware requirements for the device include :-
Camera- recognizes and tracks hand gestures
Pocket Projector- projects digital information on the wall or any other surface
Cellphone- connects to the cloud and does the processing
Mirror- helps in projecting the image on a horizontal surface by reflection
Coloured Caps- help in keeping track of the hand and recognizing gestures
The technologies used in this project are :-
Hand Augmented Reality
Gesture Recognition
Image capturing, processing and manipulation
1.1 Organization Of the Report
1. Chapter 2 describes how gesture recognition is done and its different
phases
2. Chapter 3 describes the applications of the device
3. Chapter 4 describes the components of the device
4. Chapter 5 describes the working of the device
5. Chapter 6 describes the merits and demerits of the device
6. Chapter 7 discusses the related technologies
7. Chapter 8 gives the conclusion of the report
Figure 1.1: Sixthsense device
Chapter 2
Gesture Recognition
Gesture recognition is an important aspect of this device. This chapter
describes a general gesture recognition technique (not specific to
Sixthsense). The entire process of gesture recognition can be divided into the
following parts :-
Motion Segmentation
Skin Color Model
Geometric Analysis
Motion Trajectories
Recognizing Motion Patterns Using Time Delay Neural Network (TDNN)
2.1 Motion Segmentation
Early methods can be divided into two types :-
1. Pixel based
Motion segmentation in this method is done on the basis of intensity
variations, i.e. it regards intensity variations as evidence of motion and
vice versa. This method works well for a scene with slow moving objects
or with only a few objects. But as the number of objects or their speed
increases, the performance falls.
2. Feature based
The feature based method matches image features such as points defined by
local intensity, edges, corners, etc. Features are extracted using single
scale segmentation. These features are then matched across frames.
Segmentation errors might make it difficult to find feature
correspondence across frames.
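The pixel based idea can be illustrated with simple frame differencing. The
following is only a minimal sketch, not the method of any particular system:
the function name frame_difference_mask, the threshold of 25 and the synthetic
frames are assumptions made for illustration.

```python
import numpy as np


def frame_difference_mask(prev_frame, next_frame, threshold=25):
    """Flag pixels whose intensity changed by more than `threshold`."""
    # Use a signed type so that subtracting uint8 values cannot wrap around.
    diff = np.abs(next_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold


# Synthetic 5x5 grayscale frames: a bright 2x2 block moves one pixel right.
prev_frame = np.zeros((5, 5), dtype=np.uint8)
next_frame = np.zeros((5, 5), dtype=np.uint8)
prev_frame[1:3, 1:3] = 200
next_frame[1:3, 2:4] = 200

motion_mask = frame_difference_mask(prev_frame, next_frame)
print(motion_mask.astype(int))
```

Only the leading and trailing edges of the block are flagged; the column where
the block overlaps itself shows no intensity change. This is one reason why
intensity variation alone is a weak cue for motion.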
In the motion segmentation method described here, region primitives are
used to find the 2D motion field. It performs well in situations where pixel
based methods fail. The major advantage of this method is that it is
multiscale, therefore providing a rich description of region primitives. Also,
the region primitives are not affected by noise or illumination changes.
Region correspondence is found by taking region shape, intensity and size
into account, whereas previous algorithms used a much simpler approach.
For a pair of frames (I_t, I_{t+1}), the algorithm identifies regions in each
frame comprising the multiscale intraframe structure. Regions at each
scale are then matched across frames. Motion segmentation generates
regions that have uniform motion.
2.2 Skin Color Model
Not all of the regions generated by motion segmentation are of use to us.
In fact, most of the regions can be easily discarded based on certain criteria.
The criteria might differ based on the implementation of the particular
project. Here skin color is used as a cue for selecting the regions that are
useful. The rest of the regions can be discarded.
Unlike other methods, here skin color is not used for motion
segmentation but to select the motion fields generated by the motion
segmentation process. In the case of the SixthSense device, this selection
criterion is narrowed further by the use of coloured caps on the fingers.
The method is therefore generic: the criteria for selecting the motion
fields can be chosen according to the purpose for which the system is being
designed.
Human skin color has been used and proven to be an effective feature in
numerous applications. A Gaussian mixture is used to model the skin color
distribution in CIE LUV color space, estimated from a database of 2,447
images which consists of faces of different ethnic groups. The luminance
value of each pixel is discarded to minimize the effects of lighting
conditions, and the parameters of the Gaussian mixture are estimated.
A motion region is classified as skin tone if the probability of being
skin color is above a threshold for most of its pixels.
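The classification rule above can be sketched as follows. This is an
illustration under stated assumptions, not the model from the cited work: the
mixture weights, means, covariances and the pixel threshold are hypothetical
values, and each pixel is represented only by its two chromaticity components
(luminance already discarded).

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a 2D Gaussian evaluated at each row of x (shape (N, 2))."""
    d = x - mean
    inv = np.linalg.inv(cov)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    expo = -0.5 * np.einsum("ni,ij,nj->n", d, inv, d)
    return norm * np.exp(expo)


def mixture_pdf(x, weights, means, covs):
    """Weighted sum of Gaussian components."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))


def is_skin_region(chroma_pixels, weights, means, covs,
                   pixel_threshold=1e-3, majority=0.5):
    """A region is skin tone if most of its pixels exceed the threshold."""
    p = mixture_pdf(chroma_pixels, weights, means, covs)
    return bool(np.mean(p > pixel_threshold) > majority)


# Hypothetical mixture parameters (not estimated from real data).
weights = [0.6, 0.4]
means = [np.array([100.0, 120.0]), np.array([110.0, 130.0])]
covs = [np.eye(2) * 25.0, np.eye(2) * 36.0]

hand_region = np.array([[100.0, 120.0], [102.0, 118.0],
                        [109.0, 131.0], [30.0, 30.0]])
background_region = np.array([[30.0, 30.0], [40.0, 200.0], [250.0, 10.0]])

print(is_skin_region(hand_region, weights, means, covs))
print(is_skin_region(background_region, weights, means, covs))
```

In practice the mixture parameters would be estimated from labelled skin
pixels, for example with the EM algorithm.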
2.3 Geometric Analysis
Since the shapes of human head and hands can be approximated by
ellipses, motion regions of skin tone were merged until the shape of the
merged region is approximately elliptical. Skin tone regions were sorted
based on their size.
Among the largest regions, a region was randomly selected as a seed. A
neighbor of the selected region was iteratively merged if the goodness of fit
of an elliptic function to the resulting region did not decrease by more than
a threshold. This iteration continued as long as the number of grouped
regions did not exceed a preselected maximum value. The whole process was
repeated several times with different random seeds to generate multiple
candidates, and the largest merged region was selected. The orientation of an
ellipse was estimated from the axis of the least moment of inertia.
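The orientation estimate can be sketched with second-order central moments of
a binary region mask. This is a standard image-moment formula, shown here as
an illustration; the function name ellipse_orientation and the test masks are
assumptions. The angle is measured in image coordinates (x to the right, y
downwards).

```python
import numpy as np


def ellipse_orientation(mask):
    """Angle (radians) of the axis of least moment of inertia of a region."""
    ys, xs = np.nonzero(mask)
    dx = xs - xs.mean()
    dy = ys - ys.mean()
    mu20 = np.mean(dx * dx)   # spread along x
    mu02 = np.mean(dy * dy)   # spread along y
    mu11 = np.mean(dx * dy)   # covariance between x and y
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)


# A horizontal bar and a diagonal line as simple test regions.
horizontal = np.zeros((5, 9), dtype=bool)
horizontal[2, 1:8] = True
diagonal = np.eye(7, dtype=bool)

print(np.degrees(ellipse_orientation(horizontal)))
print(np.degrees(ellipse_orientation(diagonal)))
```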
2.4 Motion Trajectories
Although motion segmentation captures motion details by matching
regions at fine scales, it is sufficient to use coarser motion trajectories of
identified hand regions for gesture recognition. The affine transformation of
a hand region in each frame pair is computed using the formula shown in
Figure 2.1.
Figure 2.1: Affine Transformation
The affine transformations of successive frame pairs are then concatenated to
construct motion trajectories of the hand region.
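A general 2D affine transformation maps a point (x, y) to
(a11*x + a12*y + tx, a21*x + a22*y + ty). The sketch below estimates such a
transformation from point correspondences by least squares and concatenates
the per-pair transformations by matrix multiplication. It is an illustration
only: the exact formula of Figure 2.1 is not reproduced here, and the function
names and the sample points are assumptions.

```python
import numpy as np


def estimate_affine(src, dst):
    """Least-squares 3x3 homogeneous affine matrix mapping src onto dst."""
    src_h = np.hstack([src, np.ones((len(src), 1))])     # rows are [x, y, 1]
    # Solve src_h @ M = dst for the 3x2 matrix M in the least-squares sense.
    M, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    T = np.eye(3)
    T[:2, :] = M.T
    return T


def concatenate(transforms):
    """Compose transforms given in temporal order (first frame pair first)."""
    total = np.eye(3)
    for T in transforms:
        total = T @ total
    return total


# Corner points of a hand region in three successive frames (made-up data).
pts0 = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 2.0], [0.0, 2.0]])
pts1 = pts0 + np.array([2.0, 1.0])
pts2 = pts1 + np.array([3.0, -1.0])

T01 = estimate_affine(pts0, pts1)
T12 = estimate_affine(pts1, pts2)
T02 = concatenate([T01, T12])

# Trajectory of the region centroid over the three frames.
centroid0 = np.append(pts0.mean(axis=0), 1.0)
trajectory = [centroid0[:2], (T01 @ centroid0)[:2], (T02 @ centroid0)[:2]]
print(np.round(trajectory, 3))
```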
2.5 Recognizing Motion Patterns Using Time Delay Neural Network (TDNN)
TDNN is a dynamic classification approach in that the network sees only a
small window of the input motion pattern and this window slides over the
input data while the network makes a series of local decisions. The local
decisions taken are later integrated into a global decision.
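The sliding-window behaviour can be sketched with a single time-delay layer
whose weights are shared across all window positions. This is a toy
illustration, not a trained network: the weights are set by hand, the two
classes ("horizontal" and "vertical" motion) and the per-frame (dx, dy)
features are assumptions, and local decisions are integrated simply by
summing them.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def tdnn_classify(sequence, weights, bias):
    """sequence: (T, F) features; weights: (C, W, F); bias: (C,)."""
    _, window, _ = weights.shape
    local = []
    for start in range(len(sequence) - window + 1):
        patch = sequence[start:start + window]            # the visible window
        scores = np.tensordot(weights, patch, axes=([1, 2], [0, 1])) + bias
        local.append(softmax(scores))                     # one local decision
    local = np.array(local)
    global_scores = local.sum(axis=0)                     # integrate decisions
    return int(np.argmax(global_scores)), local


# Hand-set weights: class 0 responds to dx, class 1 responds to dy.
weights = np.zeros((2, 3, 2))
weights[0, :, 0] = 1.0
weights[1, :, 1] = 1.0
bias = np.zeros(2)

horizontal_motion = np.tile([1.0, 0.1], (6, 1))
vertical_motion = np.tile([0.1, 1.0], (6, 1))

label_h, local_h = tdnn_classify(horizontal_motion, weights, bias)
label_v, _ = tdnn_classify(vertical_motion, weights, bias)
print(label_h, label_v, local_h.shape)
```

A real TDNN would learn the weights from labelled gesture trajectories and
would usually stack several such layers.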
Chapter 3
Applications
The applications of SixthSense technology are numerous. Some of them are
listed below :-
1. 3D Drawing :- The drawing application lets the user draw on any surface by
tracking the fingertip movements of the user's index finger.
2. Mapping :- Navigate a map using hand gestures: zoom in, zoom out, pan,
magnify a certain portion of the map, etc.
Figure 3.1: Viewing map using SixthSense device
3. Gesture Analysis :- Multi touch gestures, iconic gestures and
freehand gestures. Common gestures recognized are drawing a circle on the
wrist to show the clock, making a square using the hands to capture a photo,
and drawing an @ symbol to check emails.
Figure 3.2: Gesture for recognising time
Figure 3.3: Gesture for taking a photo
4. Touch Sensation :- A microphone attached to a sheet of paper makes it
receptive to touch, while the camera tracks the movement of the finger.
5. Make a call
Figure 3.4: Making a call using SixthSense device
6. Get product information
7. Create multimedia reading experience in newspaper
Figure 3.5: Multimedia Reading Experience
8. Get reviews of a book
9. Get dynamic updates
10. Feed information on people
11. Information acquirement through things we carry- TaPuMa
Chapter 4
Components in Sixth Sense Technology
The devices which are used in Sixth Sense Technology are:
1. Camera
2. Coloured Marker
3. Mobile Component
4. Projector
5. Mirror
1. Camera: It captures the image of the object in view and tracks the user's
hand gestures. The camera recognizes individuals, images, pictures and the
gestures that the user makes with his hand. The camera then sends this data
to a smartphone for processing. Basically the camera forms a digital eye
which connects to the world of digital information.
2. Coloured Marker: There are colour markers placed at the tips of the user's
fingers. Marking the user's fingers with red, yellow, green and blue coloured
tape helps the webcam to recognize the hand gestures. The movements and
arrangement of these markers are interpreted into gestures that act as
interaction instructions for the projected application interfaces.
3. Mobile Component: The Sixth Sense device includes a web enabled smartphone
which processes the data sent by the camera. The smartphone searches the web
and interprets the hand gestures with the help of the coloured markers placed
at the finger tips. Basic processing relies on computer vision algorithms,
with approximately 50,000 lines of code written in Symbian C++.
4. Projector: The information that is interpreted through the smartphone can
be projected onto any surface. The projector projects the visual information,
enabling surfaces and physical objects to be used as interfaces. The
projector itself contains a battery which has 3 hours of battery life. A tiny
LED projector displays the data sent from the smartphone on any surface in
view: object, wall or person. The downward facing projector projects the
image on to a mirror.
5. Mirror: The usage of a mirror is important as the projector dangles
pointing downward from the neck. The mirror reflects the image on to a
desired surface. Thus finally the digital image is freed from its confines
and placed in the physical world.
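How coloured markers might be located in a camera frame can be sketched as
follows. This is a minimal illustration and not the SixthSense implementation:
the reference RGB values, the tolerance and the function name find_markers are
all hypothetical, and a practical system would more likely threshold in a hue
based colour space to cope with lighting changes.

```python
import numpy as np

# Hypothetical reference RGB values for the four coloured finger caps.
MARKER_COLOURS = {
    "red": (220, 30, 30),
    "yellow": (230, 220, 40),
    "green": (30, 180, 60),
    "blue": (30, 60, 220),
}


def find_markers(image, tolerance=60.0):
    """Return {colour: (x, y) centroid} for every marker colour seen."""
    found = {}
    pixels = image.astype(np.float64)
    for name, rgb in MARKER_COLOURS.items():
        # Euclidean distance of every pixel to the reference colour.
        dist = np.linalg.norm(pixels - np.array(rgb, dtype=np.float64), axis=2)
        ys, xs = np.nonzero(dist < tolerance)
        if len(xs) > 0:
            found[name] = (float(xs.mean()), float(ys.mean()))
    return found


# Synthetic 20x20 RGB frame with a red and a blue marker on a black background.
frame = np.zeros((20, 20, 3), dtype=np.uint8)
frame[2:5, 3:6] = (220, 30, 30)
frame[10:13, 14:17] = (30, 60, 220)

markers = find_markers(frame)
print(markers)
```

Tracking these centroids from frame to frame gives the marker movements that
are then interpreted as gestures.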
4.1 Kinds of Gestures Recognized
The software recognizes three kinds of gestures:
1. Multi-Touch Gestures: Like the ones we see on the iPhone, where we touch
the screen and make the map move by pinching and dragging.
2. Freehand Gestures: Like framing a picture with the hands to take a photo,
or the Namaste gesture to start the projection on the wall.
3. Iconic Gestures: Drawing an icon in the air. For example, drawing a star
shows us the weather details, and drawing a magnifying glass shows us the
map.
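A pinch-and-drag gesture of the multi-touch kind can be reduced to a zoom
factor and a pan offset computed from two tracked fingertip positions. The
sketch below is an assumption about how such a mapping could look, not the
SixthSense code; the function name pinch_update and the coordinates are
hypothetical.

```python
import math


def pinch_update(prev_a, prev_b, cur_a, cur_b):
    """Return (zoom factor, pan offset) from two fingertips in two frames."""
    prev_dist = math.dist(prev_a, prev_b)
    cur_dist = math.dist(cur_a, cur_b)
    # Spreading the fingers apart zooms in, bringing them together zooms out.
    zoom = cur_dist / prev_dist if prev_dist > 0 else 1.0
    prev_mid = ((prev_a[0] + prev_b[0]) / 2, (prev_a[1] + prev_b[1]) / 2)
    cur_mid = ((cur_a[0] + cur_b[0]) / 2, (cur_a[1] + cur_b[1]) / 2)
    # Moving both fingers together drags (pans) the map.
    pan = (cur_mid[0] - prev_mid[0], cur_mid[1] - prev_mid[1])
    return zoom, pan


zoom_in = pinch_update((0, 0), (10, 0), (-5, 0), (15, 0))
drag = pinch_update((0, 0), (10, 0), (3, 4), (13, 4))
print(zoom_in, drag)
```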
Chapter 5
Working
The Sixth Sense Technology works as follows:
1. The camera captures the image of the object in view and tracks the user's
hand gestures.
2. There are colour markers placed at the tips of the user's fingers. Marking
the user's fingers with red, yellow, green and blue coloured tape helps the
webcam to recognize the hand gestures. The movements and arrangement of these
markers are interpreted into gestures that act as interaction instructions
for the projected application interfaces.
3. The smartphone searches the web and interprets the hand gestures with
the help of the coloured markers placed at the finger tips.
4. The information that is interpreted through the smartphone can be
projected onto any surface.
5. The mirror reflects the image on to a desired surface.
Chapter 6
Merits and Demerits
6.1 Advantages
1. Portable: One of the main advantages of the Sixth Sense device is its
small size and portability. It can be easily carried around without any
difficulty. The prototype of the Sixth Sense is designed in such a way that
it gives more importance to the portability factor. All the devices are light
in weight and the smartphone can easily fit into the user's pocket.
2. Support for multi touch and multi user interaction: Multi touch and multi
user interaction is another feature of the Sixth Sense device. The multi
sensing technique allows the user to interact with the system with more than
one finger at a time. Sixth Sense devices also incorporate multi user
functionality. This is typically useful for large interaction scenarios such
as interactive table tops and walls.
3. Cost effective: The cost incurred for the construction of the Sixth Sense
prototype is quite low. It was made from parts collected together from common
devices, and a typical Sixth Sense device costs up to $300. The Sixth Sense
devices have not yet been made on a large scale for commercial purposes. Once
that happens, the device is likely to cost much less than the current price.
4. Data access directly from the machines in real time: With the help of a
Sixth Sense device the user can easily access data from any machine at real
time speed. The user doesn't require any conventional machine-human interface
to access the data. Data access through recognition of hand gestures is much
easier and more user friendly than a text user interface or graphical user
interface, which requires a keyboard or mouse.
5. Mind map the idea anywhere: With the advent of the Sixth Sense device, the
requirement of a platform or a screen to analyze and interpret the data has
become obsolete. We can project the information onto any surface and can work
with and manage the data as per our convenience.
6. Open source software: The software that is used to interpret and analyse
the data collected by the device is going to be made open source, as said by
its inventor. This will enable other developers to contribute to the
development of the system.
6.2 Concerns
There are privacy and health concerns regarding Sixth Sense's projection
technology. When the device is projecting on a hard surface, the projection
is not private to the user: people around him can see the projection, which
is very detailed. Projection also works better at night and in dark areas
than in the morning and in bright areas. This is an issue because the vision
of the user can be damaged. Sixth Sense should be able to adapt its
projection technique to different times of the day, so that it is no longer
an issue for the vision of the user. Since the device is still being modified
and tested, Mistry can try to overcome these concerns with related
technologies.
Chapter 7
Related Technologies
7.1 Gesture Recognition
Gesture recognition is a technology aimed at interpreting human gestures with
the help of mathematical algorithms. It basically focuses on emotion
recognition from the face and on hand gesture recognition. Gesture
recognition enables humans to interact with computers in a more direct way,
without using any external interfacing devices. It can provide a much better
alternative to text user interfaces and graphical user interfaces, which
require a keyboard or mouse to interact with the computer. Interfaces which
depend solely on gestures require precise hand pose tracking. In the Sixth
Sense device coloured bands are used for this purpose. Once the hand pose has
been captured, the gestures can be recognized using different techniques.
Neural network approaches and statistical templates are the common techniques
used for recognition. These techniques are reported to have an accuracy of
more than 95%. Time delay neural networks can also be used for real time
recognition of the gestures.
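A very simple form of template based recognition compares a normalised
gesture path against stored example paths and picks the nearest one. The
sketch below only illustrates that idea under stated assumptions: the
template names and shapes, the number of resampling points and the distance
measure are hypothetical, and it makes no claim about the accuracy figure
quoted above.

```python
import numpy as np


def resample(path, n=16):
    """Resample a 2D path to n points equally spaced along its length."""
    path = np.asarray(path, dtype=float)
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])       # cumulative arc length
    targets = np.linspace(0.0, s[-1], n)
    x = np.interp(targets, s, path[:, 0])
    y = np.interp(targets, s, path[:, 1])
    return np.stack([x, y], axis=1)


def normalise(path, n=16):
    """Remove position and size so only the shape of the path remains."""
    pts = resample(path, n)
    pts -= pts.mean(axis=0)
    scale = np.abs(pts).max()
    return pts / scale if scale > 0 else pts


def classify(path, templates):
    """Return the name of the template with the smallest mean point distance."""
    query = normalise(path)
    dists = {name: np.mean(np.linalg.norm(query - normalise(t), axis=1))
             for name, t in templates.items()}
    return min(dists, key=dists.get)


# Hypothetical gesture templates.
templates = {
    "horizontal_line": [(0, 0), (10, 0)],
    "vertical_line": [(0, 0), (0, 10)],
    "l_shape": [(0, 0), (0, 10), (10, 10)],
}

swipe = [(2, 5), (6, 5.3), (12, 5.1)]
stroke_down = [(0, 0), (0.2, 4), (0, 9)]
print(classify(swipe, templates), classify(stroke_down, templates))
```

Because the paths are compared point by point in drawing order, this sketch
is sensitive to the direction in which a gesture is drawn.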
7.2 Augmented Reality
Augmented reality is a visualization technique that allows the user to
experience virtual content added over the real world in real time. With the
help of advanced AR technology, the information about the surrounding real
world of the user becomes interactive and digitally usable. Artificial
information about the environment and the objects in it can be stored and
retrieved as an information layer on top of the real world view. On the
spectrum between virtual reality, which creates immersive, computer-generated
environments, and the real world, augmented reality is closer to the real
world. Augmented reality adds graphics, sounds, haptic feedback and smell to
the natural world as it exists. Both video games and cell phones are driving
the development of augmented reality. Augmented reality systems will also
superimpose graphics for every perspective available and try to adjust to
every movement of the user's head and eyes. The three basic hardware
components of an augmented reality system are the head mounted display, the
tracking system and the mobile computer. The main goal of this technology is
to merge these three components into a highly portable unit. The head mounted
display used in augmented reality systems enables the user to view
superimposed graphics and text created by the system. Another component of an
augmented reality system is its tracking and orientation system. This system
pinpoints the user's location in reference to his surroundings and
additionally tracks the user's eye and head movements.
7.3 Computer Vision
Computer vision is the technology by which machines are able to interpret or
extract necessary information from an image. Computer vision includes various
fields like image processing, image analysis and machine vision. It also
includes certain artificial intelligence techniques like pattern recognition.
The machines which implement computer vision techniques require image sensors
that detect electromagnetic radiation, usually in the form of visible or
infrared light. Computer vision finds application in various fields of
interest. One such field is biomedical image processing. It is also used in
autonomous vehicles. The computer vision technique basically includes four
processes.
Chapter 8
Conclusion
Sixthsense technology is currently in its early stages and is
regarded as a very powerful project by many experts. The future of this
project is very promising and we will surely be seeing its applications all
around the world very soon. Pranav Mistry has indicated that he will be
making the project code open pretty soon. This will give a major boost to the
development of this project. Recently Pranav Mistry used sixth sense
technology to implement a mouse that needs no physical device, using lasers
and gesture recognition.
Bibliography
[1] Sanjeev Tayal, Pramod Kr. and Monika Garg, Integrating Information with the Real
World Using Sixth Sense Computing, VSRD-IJCSIT, Vol. 2 (2), 2012, 137-145.
http://www.vsrdjournals.com/CSIT/Issue/2012_02_Feb/Web/8_Sanjeev_Tayal_588_Research_Communication_Feb_2012.pdf
[2] Pranav Mistry (inventor of Sixthsense technology), sixthsense, integrating
information with the real world. http://www.pranavmistry.com/projects/sixthsense/
[3] Minhal Mehdi, Complete Guide of Making Sixthsense Device, March 2012.
http://www.techexperiments.in/2012/03/complete-guide-of-making-sixthsense.html
[4] Diann Daniel, MIT Wearable Gadget Gives You Sixth Sense, CIO, Apr 14, 2009.
http://www.pcworld.com/article/163072/wearable_gadget.html
[5] Sayantan Mandal, 'Sixthsense' Technology for Lifelong Learning: Unlashing Unlimited
Potentiality, ECER 2012.
http://www.eera-ecer.de/ecer-programmes/conference/6/contribution/16124/
[6] At TED, Virtual Worlds Collide With Reality.
http://pogue.blogs.nytimes.com/2009/02/11/at-ted-virtual-worlds-collide-with-reality/
[7] Ming-Hsuan Yang, Narendra Ahuja and Mark Tabb, Extraction of 2D Motion Trajectories
and Its Application to Hand Gesture Recognition.
[8] Pranav Mistry, Tsuyoshi Kuroki and Chaochi Chang, MIT Media Laboratory, TaPuMa:
Tangible Public Map for Information Acquirement through the Things We Carry.