The beginnings of gesture-based interfaces
2 Saturday, 27 April 13
Gesture Recognition
§ 1970 Myron W. Krueger and VideoPlace
http://www.inventinginteractive.com/2010/03/22/myron-krueger/
http://sofa23.net/index.php?m=1&sm=&t=23&sp=18&spic=43&me=show%20all&s=
One of the first prototyped VR systems
Uses cameras for recognition
Simple ideas
Gesture Recognition
(Baudel and Beaudouin-Lafon, 1993)
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
First formal definition of gestures
Used to control a presentation (PowerPoint-like)
Uses a data glove
Notation: 4 lines = fingers, 1 line = thumb
Gesture Recognition
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
(Baudel and Beaudouin-Lafon, 1993)
Selection of gestures
Gesture Recognition
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
§ 2002 Minority Report
http://7thperbmmrblog.blogspot.ch/2011/01/william-bermudez.html
http://thomaspmbarnett.com/globlogization/2013/2/5/times-battleland-terrorism-minority-report-has-finally-arriv.html
Hollywood movie by Steven Spielberg
Rooted in research by John Underkoffler
“Like conducting an orchestra”
Tom Cruise
Gesture Recognition
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
§ 2002 Minority Report
§ 2009 Oblong Industries
Last step in our history of gesture-based interfaces
Commercial company founded by John Underkoffler
Developed g-speak
Intended for big data analysis
Requires specialized applications
Oblong Industries - Demo
http://oblong.com/g-speak/
Orientation in 3D
Selection
Segmentation
Common Factor
http://www.5dt.com/DataGloveImages.html
Most of the systems shown have one thing in common: the data glove
Hand tracking
Hand reconstruction
Feedback
How can we get rid of the Data Glove?
Free up the hands
Remove instrumentation
Muscle Computer Interface
§ Hands free gestures while holding an object
§ Arm band like design
§ Sensing muscle activity
(Saponas et al, 2009)
Hands free
Muscle sensing
Muscle Computer Interface - Technology
http://painmd.tv/wp-content/uploads/2011/04/emg-muscle-configuration.gif
(Saponas et al, 2009)
EMG (electromyography)
Used primarily in medical therapy (muscle function assessment, controlling prosthetics)
An action potential is generated by the muscle when a signal arrives from a motor neuron
Measured invasively by inserting a needle into the muscle
Measured non-invasively by sensing on the skin
Muscle Computer Interface - Technology
http://www.emgsrus.com/graphics/emg_trial_rect_page.png
(Saponas et al, 2009)
Measured activity is shown here
6 different muscles
Peaks of action potentials
Muscle Computer Interface - Technology
http://www.nature.com/gimo/contents/pt1/fig_tab/gimo32_F2.html
Support Vector Machine
(Saponas et al, 2009)
§ Root mean square
§ Frequency energy
§ Phase Coherence
6 sensors and 2 ground electrodes
Features extracted from a 31 ms sample:
- Root mean square of the amplitude per channel, and the ratio between each pair of channels: sqrt(1/n * (x1^2 + x2^2 + ...))
- Frequency energy via FFT
- Relationship between channels (phase coherence)
Classified into gestures by an SVM
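The feature extraction in these notes can be sketched in a few lines. This is a minimal illustration, not the implementation from Saponas et al.: the function names, the naive DFT used instead of an optimized FFT, and the small epsilon that guards the ratios are all assumptions, and the phase-coherence feature is left out.

```python
import cmath
import math


def rms(samples):
    """Root mean square: sqrt(1/n * (x1^2 + x2^2 + ...))."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def band_energy(samples):
    """Spectral energy of one window via a naive DFT (DC bin skipped)."""
    n = len(samples)
    energy = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        energy += abs(coeff) ** 2
    return energy


def extract_features(window):
    """window: list of channels, each a list of samples from one short frame."""
    rms_values = [rms(channel) for channel in window]
    # Ratio of RMS amplitudes for every pair of channels (epsilon avoids /0).
    ratios = [rms_values[i] / (rms_values[j] + 1e-12)
              for i in range(len(window))
              for j in range(i + 1, len(window))]
    energies = [band_energy(channel) for channel in window]
    return rms_values + ratios + energies
```

With 6 channels this yields 6 RMS values, 15 pairwise ratios, and 6 energy values per window, which would then be handed to the classifier.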
Support Vector Machines
§ Binary Linear Classifier
§ Extended to multiple classes
https://en.wikipedia.org/wiki/File:Kernel_Machine.png
The function phi transforms the feature space so that a hyperplane can be laid between two classes
The separator is placed so that the margin between the classes is as large as possible
Multiple classes are handled one-vs-rest or pairwise (one-vs-one)
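A toy example may make the role of phi concrete. The sketch below assumes a hand-picked feature map phi(x) = (x, x^2) and hand-picked weights; it does not train an SVM, it only shows that data which no single threshold separates in 1-D becomes linearly separable after the transform, and how one-vs-rest picks a class from several binary scorers. All names are hypothetical.

```python
def phi(x):
    """Hypothetical feature map: lift a 1-D point into 2-D as (x, x^2)."""
    return (x, x * x)


def decision(w, b, features):
    """Signed score w . features + b of a linear classifier."""
    return sum(wi * fi for wi, fi in zip(w, features)) + b


# Toy data: the "inner" class sits between two groups of the "outer" class,
# so no single threshold on x separates them.
inner = [-0.5, 0.0, 0.7]
outer = [-3.0, -2.5, 2.2, 3.1]

# In the lifted space the line x^2 = 2 separates them
# (weights chosen by hand, not learned by an SVM solver).
w, b = (0.0, 1.0), -2.0


def one_vs_rest(classifiers, features):
    """classifiers: {label: (w, b)}; return the label with the largest score."""
    return max(classifiers,
               key=lambda label: decision(*classifiers[label], features))
```

Here `decision(w, b, phi(x))` is negative for every inner point and positive for every outer point. A real SVM would choose `w` and `b` to maximize the margin rather than take them as given.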
Muscle Computer Interface - Demo
(Saponas et al, 2009)
Guitar Hero
Input is sent as soon as the user touches both fingers together
Muscle Computer Interface
§ Pro
§ No instrumentation of hand
§ Hidden near elbow
§ Contra
§ Inaccurate compared to some of the following papers
§ Muscle activity required
(Saponas et al, 2009)
79 % accuracy
Gesture Wrist
§ Hands free gestures
§ Embed sensing device in wrist watch
§ Feedback on gesture
(Rekimoto, 2001)
Gesture Wrist - Technology
Figure 2: GestureWrist: Wristband-type input device. (Labels: original wristwatch dial, transmitter electrode, receiver electrodes, acceleration/tilt sensor (ADXL202), piezo-actuator, wrist.)
3.2 On-body networking
Based also on capacitive sensing, a technique that transmits data through the human body has been proposed [14, 5]. Here, both a transmitter and a receiver are capacitively coupled to the human body. When a transmission signal is modulated by data (by using amplitude shift keying (ASK) or frequency shift keying (FSK)), this affects the modified signal that is received at the receiver side. Using this technology, wearable devices can communicate with each other [14], or they can automatically authenticate digital devices that are touched [5]. We also use this technique for distinguishing a wearer from other people while interacting with GesturePad.
4 GestureWrist: A wristband-type input device
GestureWrist is a wristwatch-type input device that recognizes human hand gestures by capacitively measuring wrist-shape changes and also measuring forearm movements.
Figure 3: Sensing arm-shape change based on capacitive sensing.
Figure 2 shows the current GestureWrist prototype. This device consists of two input sensors (capacitance and acceleration sensors), and one tactile feedback actuator. The prototype is fabricated by attaching the sensors and actuators to a conventional wristwatch. We expect that embedding all the sensing elements within the wristwatch and the wristwatch band is technically possible, so a wearer can use this system in any social situation. Sensed information is processed at an external signal-processing board connected by a cable.
4.1 Hand-gesture recognition
GestureWrist recognizes hand gestures by measuring the changes of the arm shape on the inside of the wristband. To do this, a combination of transmitter and receiver electrodes are attached to the back of the watch dial and inside of the wristband. As described in the previous section, this combination acts as a capacitance sensor.
The principle of gesture sensing is shown in Figure 3. When a wearer opens and closes his or her hand, the cross-sectional shape of the wrist changes accordingly; particularly, the left and right parts around the forearm sinew slightly bulge or cave in. A transmitter behind the wristband dial transmits a square wave signal (at approximately 160 kHz). This signal goes through the wrist, and is received by the receiver electrodes on the wristband. The amplitude of the receiving signal is determined by the capacitance between the transmitter electrode and the wrist, the resistance of the wrist, and the capacitance between the wrist and the receiver electrode. Since the first two values are mostly stable, the received signal strength is mainly determined by the last parameter (capacitance between the wrist and the receiver).
To calibrate the displacement of receiving electrodes, more than one electrode is installed on the wristband. The current prototype has three receivers. Each transmitter-receiver pair produces sensed values. The values conform
The first device, GestureWrist, is a wristwatch-type input device that recognizes human hand gestures by capacitively measuring changes in wrist shape. Combined with an acceleration sensor, which is also mounted to the wristband, the GestureWrist can be used as a command-input device, with a physical appearance almost identical to today’s wristwatches.
The latter device, GesturePad, is a layer of sensor electrodes that transforms conventional clothes into interaction devices, or “interactive clothing”. This module can be attached to an area of clothes such as a sleeve or a lapel. Also based on capacitive sensing, it can detect and read finger motions applied to the outside of the clothing fabric, while shielding the capacitive influence from the human body.
2 Related work
Some wearable computers use physical dials, buttons, or touch-pads as input devices [10]. These devices are used to select menus or control nearby ubiquitous computers or appliances. We are aiming at similar applications by using more unobtrusive devices.
Baudel and Beaudouin-Lafon demonstrated a hand-gesture input system that is used as a remote control method [1]. A wearer can control a presentation system by using hand-gestures. Since this system is based on “DataGlove” and an attached position sensor, a user has to first put on a glove to use it. In contrast, our solution aims to be more seamless; using wearable input devices requires no particular preparation.
GesturePendant is a camera-based gesture recognition system that can be worn like a pendant [9]. A user can hand gesture in front of it while it is worn around the neck. The current prototype is still noticeably bigger than an ideal one, and would presupposedly always wear it over their clothes.
Wireless FingerRing is a hand-worn input device consisting of acceleration-sensitive finger rings and a wristband-type receiver [3]. A user puts on four rings, and taps on a flat surface with one finger. This is detected by the ring’s sensor, and the information is transmitted to the wristband receiver through an on-body network. Acceleration Sensing Glove also uses an acceleration sensor on each fingertip [6]. While wearing one finger ring is common and socially accepted, putting on four rings is unusual and thus it is unlikely all of us would do it. Supplying sufficient power to operate all the finger rings is an additional unsolved technical problem.
Measuring muscle tension (electromyogram, or EMG) and using the information as computer inputs has been widely studied [12]. This method is important for people with physical disabilities. However, it also involves some difficulties. One problem is placing the electrode. To correctly measure electricity, electrodes must have direct contact to the skin, often requiring wet-conductive gel. At least two (and often at least three) electrodes need to be attached to the skin, and maintain certain distances. These requirements make it difficult to configure a simple wristband-type EMG sensor that can be easily worn. Our method measures the cross-sectional shape of the wrist, instead of using an EMG, to detect hand motions.
Figure 1: A capacitive sensor is used to measure distance between sensor electrodes and an object. (Labels: wave signal, transmitter, receiver, analog switch, LPF, AD converter.)
3 Technological background
Before describing our proposed input devices, we briefly introduce their sensing technologies.
3.1 Capacitance sensing
“Capacitance sensing” is a technique for measuring distances of nearby conductive objects by measuring the capacitance between the sensor and the object and uses a transmitter and a receiver electrode (Figure 1). When the transmitter is excited by a wave signal (of typically several hundred kilohertz), the receiver receives this wave. The magnitude of the receiving signal is proportional to the frequency and voltage of the transmitted signal, as well as to the capacitance between the two electrodes.
When a conductive object is close to both electrodes, it also capacitively couples to the electrode and strengthens the receiving wave signal amplitude. When a conductive and grounded object is close to both electrodes, it capacitively couples to the electrodes, drains the wave signal, and thus weakens the received signal amplitude. By measuring these effects, it is possible to detect the proximity of conductive objects.
The received signal often contains noises from nearby electric circuits and inverters of fluorescent lamps. To accurately measure signals from the transmitter electrode only, a technique called “lock-in amplifier” can be used. This technique uses an analogue switch as a phase-sensitive detector. A control signal is used to switch it on and off, to select signals that have the synchronized frequency and phase of the transmitted signal. Normally, a control signal needs to be created by phase-locking the incoming signal, but for capacitive sensing, the system can simply use a transmitted signal, because the transmitter and the receiver are both on the same circuit board.
This capacitive sensing technique is mainly used for proximity and position sensors [15]. In our work, capacitive sensing is used for measuring the arm shape by placing both the transmitter and the receiver electrodes on a wristband, and for measuring finger positions by attaching electrodes on the inside of clothes.
§ Wave signal is transmitted
§ The receivers are synchronized with the transmitter
§ The received signal strength depends on the distance between wrist and receiver electrode
(Rekimoto, 2001)
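The synchronized ("lock-in") detection mentioned on this slide can be simulated in software. The sketch below is an illustration under stated assumptions, not the circuit from the paper: the sample rate, the 50 kHz interference, and the amplitude values are invented for the simulation, and averaging stands in for the analogue low-pass filter.

```python
import math


def square_wave(freq_hz, sample_rate_hz, n):
    """Reference signal: +1/-1 square wave, like the transmitter drive."""
    return [1.0 if (t * freq_hz / sample_rate_hz) % 1.0 < 0.5 else -1.0
            for t in range(n)]


def lock_in_amplitude(received, reference):
    """Phase-sensitive detection: multiply by the reference, then average.
    The average plays the role of the low-pass filter."""
    return sum(r * ref for r, ref in zip(received, reference)) / len(received)


SAMPLE_RATE = 3_200_000  # Hz, assumed for the simulation
CARRIER = 160_000        # Hz, approximate transmit frequency on the slide
N = 6400                 # number of simulated samples (2 ms)

reference = square_wave(CARRIER, SAMPLE_RATE, N)
# Received signal: attenuated carrier plus interference at an unrelated frequency.
received = [0.3 * ref + 0.5 * math.sin(2 * math.pi * 50_000 * t / SAMPLE_RATE)
            for t, ref in enumerate(reference)]
estimate = lock_in_amplitude(received, reference)
```

Because the interference is not synchronized with the reference, it averages out, and `estimate` recovers the carrier amplitude even though the interference is larger than the signal.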
The actuator vibrates for feedback
Measures the capacitance between the wrist and the receiver electrodes
This corresponds to measuring the distance between wristband and wrist
Gesture Wrist
§ Distinguish ‘Point’ and ‘Fist’ pose
(Rekimoto, 2001)
Gesture Wrist - Technology
Clear difference between point and fist
Only two poses are distinguished
Gesture Wrist - Examples
§ Distinguish ‘Point’ and ‘Fist’ pose
§ Combined with an accelerometer
§ Rotation also recognizable
(Rekimoto, 2001)
Only two poses are distinguished
Rotation is used to control a slider or knob
Gesture Wrist
§ Pro
§ Small, watch like design
§ Sensor embedded inside accessory
§ Simple recognition method
§ Contra
§ Only a small set of gestures can be recognized
(Rekimoto, 2001)
Hand Shape with Wrist Contour
§ Hands free gestures
§ Wrist watch like design
(Fukui et al, 2011)
Hand Shape with Wrist Contour - Technology
§ Static wrist band
§ Photo reflectors
§ Senses distance between band and skin
Figure 2. Wrist contour basis. (Labels: wrist contour, wrist cross section, hand shape; flexor and extensor carpi, flexor and extensor pollicis, flexor and extensor digitorum; axes X and Y.)
Figure 3. Data flow block diagram. (Blocks: measurement of wrist contour by photo reflector array, data collection and transfer in the sensor device, RF link to the PC, output to distance conversion, feature extraction, hand shape classification.)
shows examples of hand shapes and wrist contour sets. Muscles and tendons for finger movements are compacted near the elbow. Around the wrist, however, tendons and muscles are separated to some extent, so they are comparatively observable. We observed the variation of their thicknesses and positions, which vary with finger movements. For example, to bend a finger, a flexor contracts and the nearby wrist surface dents. To straighten a finger, a flexor relaxes and the nearby wrist surface becomes as before. Our approach is to recognize hand shapes from these variations.
WRIST CONTOUR MEASURING SYSTEM
Figure 3 shows our system configuration and data flow diagram. We developed a wrist watch type sensor device (Figure 4) and a recognition system.
Required specification
Human constraints and our design are as follows.
• Human constraints:
(1a) Muscles and tendons for finger movements are approximately 5 mm in diameter.
(1b) Radial variation of wrist contour is approximately 5 mm at maximum.
(2a) Wrist circumference is approximately 150–170 mm.
(2b) Human arm motions should not be interrupted.
• Design:
(1a) Sensor pitch is 2.5 mm around circumference.
(1b) Radial resolution of the sensors is 0.1 mm.
(2a) Measurement area is at least 170 mm in circumference.
(2b) The band is narrower than 30 mm.
To achieve the design requirements, we adopted photo reflector sensors and shift register switching method.
Photo reflector as distance sensor
Photo reflector is a combination of infrared LED and photo transistor. LED transmits an infrared signal and Photo transistor detects the intensity of the signal reflected at the surface of the object as shown in Figure 5. We selected a small photo reflector sensor ”NJL5901AR-1” (produced by New Japan Radio Co.) to achieve the measurement density 2.5 mm.
Figure 4. Wrist contour measuring device. (Labels: measurement band, measurement part, photo reflector, pitch 2.5 mm, spacer, fixing band, control board (front, rear), battery and control part, micro controller, ZigBee module, battery; cross section 25 mm, 12 mm.)
Figure 5. Mechanism of photo reflector. (Labels: infrared signal, 2.5 mm.)
Because an output of photo reflector is non-linear with distance, and sensors have individual differences, raw outputs cannot be used for measuring distances as they are. Then, we calibrated the outputs by prior measurement. We measured range of 0–10 mm with 0.05 mm pitch with 1-axis automatic stage to achieve 0.1 mm radial resolution. As a result, we achieved 0.1 mm resolution in 0–3.5 mm. As figure 6 indicates, the smooth surface of an inclined flat board can be recognized in the range of 0–3.5 mm.
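One plausible way to apply such a prior calibration is a per-sensor lookup table with linear interpolation. The sketch below is an assumption about how the conversion could be done, not the method from Fukui et al.; the calibration values are invented, and the raw readings in the table are assumed to be distinct and monotonic over the usable range.

```python
from bisect import bisect_left


def make_converter(calibration):
    """calibration: (raw_output, distance_mm) pairs from a prior sweep.
    Returns a function mapping a raw reading to a distance in mm."""
    table = sorted(calibration)
    raws = [raw for raw, _ in table]

    def to_distance(raw):
        # Clamp readings outside the calibrated range.
        if raw <= raws[0]:
            return table[0][1]
        if raw >= raws[-1]:
            return table[-1][1]
        i = bisect_left(raws, raw)
        (r0, d0), (r1, d1) = table[i - 1], table[i]
        return d0 + (d1 - d0) * (raw - r0) / (r1 - r0)

    return to_distance


# Hypothetical calibration sweep for one sensor (8-bit A/D output, mm).
sensor_7 = make_converter([(250, 0.0), (200, 0.5), (120, 1.5), (60, 3.5)])
```

Since the sensors differ individually, each one would get its own table; `sensor_7(160)` interpolates between the two neighbouring calibration points.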
Figure 6. Measuring an inclined board. (Axes: sensor number, 1–71; A/D conv. output (8-bit), 0–250; distance (mm), 0–10. Series: raw data, distance.)
Figure 7. Shift register switching method. (Labels: clock, control, Vcc, D/Q flip-flops with outputs Q1, Q2, Q3, signal, output, circuit, shift register.)
Shift register switching method
To measure the whole circumference of wrist contours, we arranged photo reflector sensors in rows. We mounted them
Paper Session: Home and Away UbiComp'11 / Beijing, China
(Fukui et al, 2011)
150 sensors
Hand Shape with Wrist Contour - Demo
(Fukui et al, 2011)
Static image representing each gesture
Hand Shape with Wrist Contour - Examples
(Fukui et al, 2011)
The recognized gesture set
Some gestures are quite similar
Hand Shape with Wrist Contour - Accuracy
(Fukui et al, 2011)
Confusion matrix
Wide spread
The boosting and k-NN methods used are rather simple
The diagonal holds the correctly recognized gestures
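For readers unfamiliar with the two terms in these notes, a k-NN classifier and a confusion matrix fit in a few lines. This is a generic sketch with invented 2-D toy features and labels, not the feature set or classifier configuration from Fukui et al.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Majority vote of the k nearest."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def confusion_matrix(pairs, labels):
    """pairs: (true_label, predicted_label). Rows = true, columns = predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true, predicted in pairs:
        matrix[index[true]][index[predicted]] += 1
    return matrix


# Hypothetical training data: two hand shapes in a made-up 2-D feature space.
train = [((0, 0), "fist"), ((0, 1), "fist"), ((1, 0), "fist"),
         ((5, 5), "point"), ((5, 6), "point"), ((6, 5), "point")]
```

Counts on the diagonal of the matrix are correct recognitions; everything off the diagonal shows which gestures get mistaken for which.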
Hand Shape with Wrist Contour
§ Pro
§ Small, watch like design
§ Can be hidden inside accessory
§ New approach to gesture recognition
§ Contra
§ Bad recognition rate
§ Limited set of gestures
(Fukui et al, 2011)
Digits
§ Recover full 3D hand model
§ Cheap hardware
§ Low power
(Kim et al, 2012)
Already partly presented by Professor Hilliges in the introduction to the seminar
More sophisticated
Imitates a data glove
Digits - Technology
3D Laser Triangulation
Background Subtraction CCL & Tracking
Hand Pose Recovery
(Kim et al, 2012)
We use a number of image processing techniques to segment and track five discrete points on the fingers
Knowing the camera and laser position, we can triangulate 3D positions from this information
And finally use a kinematics model to recover the full hand configuration
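The triangulation step in these notes can be pictured as intersecting a camera ray with the known plane of the laser line. The sketch below assumes a pinhole camera at the origin looking along +z and a laser plane given by a normal and a point on it; the function name, the parameters, and the numbers in the usage line are hypothetical and not taken from the Digits paper.

```python
def triangulate(pixel, focal_px, principal, plane_normal, plane_point):
    """Intersect the camera ray through `pixel` with the laser plane.
    Returns the 3D point in camera coordinates, or None if the ray is parallel."""
    u, v = pixel
    cx, cy = principal
    # Direction of the ray through the pixel (pinhole model, camera at origin).
    ray = ((u - cx) / focal_px, (v - cy) / focal_px, 1.0)
    denom = sum(n * r for n, r in zip(plane_normal, ray))
    if abs(denom) < 1e-12:
        return None  # ray never meets the laser plane
    # Solve n . (t * ray - plane_point) = 0 for the ray parameter t.
    t = sum(n * p for n, p in zip(plane_normal, plane_point)) / denom
    return tuple(t * r for r in ray)


# Hypothetical setup: laser plane at z = 100 mm, 500 px focal length.
point = triangulate((420, 240), 500.0, (320, 240), (0.0, 0.0, 1.0), (0.0, 0.0, 100.0))
```

Each tracked laser point on a finger would yield one such 3D position, which the kinematic model then uses to pose the hand.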
Digits - Examples
(Kim et al, 2012)
accurate
Digits - Demo
(Kim et al, 2012)
Shooting
Grabbing
Pulling
Digits
§ Pro
§ Portable
§ Internal processing
§ Accurate replacement for data glove
§ Contra
§ As obtrusive as a data glove
§ Occlusion is a major problem
(Kim et al, 2012)
Towards bimanual gestures
The previous papers all tried to reconstruct a model of the hand, more or less accurately
The next paper moves away from reconstruction, towards using the second hand for input and the first hand as a trigger
Gesture Watch
§ Contact free interface
§ Unobtrusive
(Kim et al, 2007)
The device recognizes the other hand
The arm wearing the device is used to initiate the gesture
Gesture Watch - Technology
Sensor signal
Recognized gesture
(Kim et al, 2007)
4 proximity sensors arranged in a cross
+ 1 sensor, facing the hand, for initiating a gesture
Binary 0/1 sensors
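To show how a cross of binary sensors plus a trigger can encode a gesture, here is a deliberately simple rule-based decoder. It is a swapped-in illustration, not the recognizer from Kim et al.; the sensor names, the frame layout, and the output strings are all hypothetical.

```python
SENSORS = ("up", "down", "left", "right")  # hypothetical names for the cross


def decode_swipe(frames):
    """frames: sequence of (trigger, up, down, left, right) binary readings.
    Returns e.g. 'left->right', or None if no gesture was armed or seen."""
    order = []
    for trigger, *cross in frames:
        if not trigger:
            continue  # readings only count while the trigger sensor is covered
        for name, active in zip(SENSORS, cross):
            if active and name not in order:
                order.append(name)
    if len(order) < 2:
        return None
    return f"{order[0]}->{order[-1]}"
```

The order in which the sensors first fire gives the swipe direction, and the trigger sensor keeps ordinary arm movements from being read as input.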
Gesture Watch - Examples
(Kim et al, 2007)
proposed gestures
Gesture Watch
§ Pro
§ Unobtrusive design
§ Sensors embedded
§ Contact free
§ Private
§ Contra
§ Requires action from second hand to start gesture
(Kim et al, 2007)
private by hiding the gesture from other people
What if we could eliminate all instrumentation?
So far, instrumentation of the user is still required
Goal: hands free
Goal: cheaper
Sound Wave
§ No instrumentation of user
§ Reusing existing hardware
(Gupta et al, 2012)
Reuses speakers and microphone from an existing laptop
Sound Wave - Technology
formed and sensed [4]. While these projects show the po-
tential of low-cost sonic gesture sensing, they require cus-
tom hardware, which is a significant barrier to widespread
adoption. In our work, we focus on a solution that works
across a wide range of existing hardware to facilitate im-
mediate application development and adoption.
THE SOUNDWAVE SYSTEM SoundWave uses existing speakers on commodity devices
to generate tones between 18-22 kHz, which are inaudible.
We then use the existing microphones on these same devic-
es to pick up the reflected signal and estimate motion and
gesture through the observed frequency shifts.
Theory of Operation The phenomenon SoundWave uses to sense motion is the
shift in frequency of a sound wave in response to a moving
object, an effect called the Doppler effect. This frequency
shift is proportional to source frequency and to the velocity
with which the object moves. In our approach, the original
source (the speakers) and listener (the microphone) are sta-
tionary, thus in absence of any motion, there is no frequen-
cy change. When a user moves his hand, however, it re-
flects the waves, causing a shift in frequency. This
frequency is measured by the microphone ($f_r$) and can be
described by the following equation, which is used for
Doppler radar as well as for estimating frequency changes
in reflection of light by a moving mirror [2]:
$$f_r = f_t \cdot \left(\frac{c + v}{c - v}\right)$$
where $f_r$ is the perceived frequency at the microphone, $f_t$ is the original frequency from the speaker, $c$ is the speed of sound in air, and $v$ is the velocity of the target (hand).
Figure 2 shows the frequency of the signal (a) when no motion is present and when a hand is moved (b) away from or
(c) closer to the laptop. This change in frequency as a hand
moves farther or closer is one of the many characteristic
properties of the received signal that we leverage in detect-
ing motion and constructing gestures.
Algorithm & Implementation Details SoundWave generates a continuous pilot tone, played
through the device’s speakers at the highest possible fre-
quency (typically in the range of 18-22 kHz on commodity
audio systems). Although we have verified that SoundWave
can operate on audio down to 6 kHz, we favor tones above
18 kHz since they are generally inaudible [1]. Additionally,
the higher the frequency, the greater the shift for a given
velocity, which makes it computationally easier to estimate
motion at a given resolution. The upper bound is largely a
function of most laptop and phone speaker systems only
being capable of producing audio at up to 22 kHz. Fortu-
nately, we do not need much higher frequencies to sense the
relatively coarse gestures we are targeting.
Due to variations in hardware as well as filtering in sound
and microphone systems, SoundWave requires an initial
calibration to find the optimal tone frequency (no user in-
tervention is required). It performs a 500 ms frequency
sweep, and keeps track of peak amplitude measurements as
well as the number of candidate motion events detected
(i.e., potential false positives). SoundWave selects the high-
est frequency at which minimum false events are detected
and the peak is most isolated (i.e., the amplitude is at least
3 dB greater than next-highest peak in the sweep range).
The system consistently favors the 18-19 kHz range.
With the high-frequency tone being emitted, any motion in
proximity (around 1 m depending on speed) of the laptop
will cause Doppler-shifted reflections to be picked up by
the microphone, which is continuously sampled at
44.1 kHz. We buffer the incoming time-domain signal from
the microphone and compute the Fast Fourier Transform
(FFT) with 2048-point Hamming window vectors. This
yields 1024-point magnitude vectors that are spread equally
over the spectral width of 22.05 kHz. After each FFT vector
is computed, it is further processed by our pipeline: signal
conditioning, bandwidth extraction, motion detection, and
feature extraction.
Signal Conditioning: Informal tests with multiple people
indicated that the fastest speed at which they could move
their hands in front of a laptop was about 3.9 m/sec. Hence,
we conservatively bound signals of interest at 6 m/sec. Giv-
en our sampling rate and FFT size, this yields about 33 fre-
quency bins on either side of the emitted peak.
Bandwidth Extraction: As seen in Figure 2, motion around
the device creates a shifted frequency that effectively in-
creases the bandwidth of the pilot tone (i.e., window aver-
aging and spectral leakage blur the movement of the peak).
To detect this, SoundWave computes the bandwidth of the
pilot tone by scanning the frequency bins on both sides in-
Figure 2: (a) Pilot tone with no motion. (b and c) Increase in bandwidth on left and right due to motion away from and towards the laptop respectively. (d) Shift in frequency large enough for a separate peak. A single scan would not capture the true shift in frequency and would terminate at the local minima. A second scan compensates for the bandwidth of the shifted peak.
Session: Sensory Interaction Modalities CHI 2012, May 5–10, 2012, Austin, Texas, USA
(Gupta et al, 2012)
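Doppler effect
Emitted sound: 18–22 kHz
Input is sampled, then transformed with an FFT
The 22.05 kHz spectrum is divided into 1024 bins; about 33 bins on either side of the pilot tone are examined
Bins are scanned outward until the amplitude drops below 10 % of the pilot tone peak
A second scan handles shifted peaks (30 % threshold relative to the pilot tone)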
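The numbers in these notes can be checked with a short calculation. The sketch below assumes the usual moving-reflector Doppler relation f_r = f_t (c + v) / (c - v), a speed of sound of 343 m/s, and a 20 kHz pilot tone; the function names and the toy spectrum in the usage note are hypothetical, and the scan covers only the first (10 %) pass, not the second scan for separated peaks.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed value)


def reflected_frequency(f_t, v, c=SPEED_OF_SOUND):
    """Frequency at the microphone for a reflector approaching at v m/s."""
    return f_t * (c + v) / (c - v)


def bins_for_speed(f_t, v, sample_rate=44100, fft_size=2048):
    """How many FFT bins the Doppler shift at speed v spans."""
    bin_width = sample_rate / fft_size  # about 21.5 Hz per bin
    return (reflected_frequency(f_t, v) - f_t) / bin_width


def scan_bandwidth(magnitudes, peak_bin, threshold=0.10, max_bins=33):
    """Walk outward from the pilot-tone bin until the magnitude falls below
    `threshold` times the peak; returns (bins_left, bins_right)."""
    limit = threshold * magnitudes[peak_bin]
    widths = []
    for step in (-1, 1):
        count = 0
        i = peak_bin + step
        while (0 <= i < len(magnitudes) and count < max_bins
               and magnitudes[i] >= limit):
            count += 1
            i += step
        widths.append(count)
    return tuple(widths)
```

Under these assumptions `bins_for_speed(20000, 6)` comes out at roughly 33, which matches the 33 bins on either side of the pilot tone. A spectrum that is wider on one side of the peak than the other then indicates motion towards or away from the device.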
Sound Wave - Technology
formed and sensed [4]. While these projects show the po-
tential of low-cost sonic gesture sensing, they require cus-
tom hardware, which is a significant barrier to widespread
adoption. In our work, we focus on a solution that works
across a wide range of existing hardware to facilitate im-
mediate application development and adoption.
THE SOUNDWAVE SYSTEM SoundWave uses existing speakers on commodity devices
to generate tones between 18-22 kHz, which are inaudible.
We then use the existing microphones on these same devic-
es to pick up the reflected signal and estimate motion and
gesture through the observed frequency shifts.
Theory of Operation
The phenomenon SoundWave uses to sense motion is the shift in frequency of a sound wave in response to a moving object, an effect called the Doppler effect. This frequency shift is proportional to source frequency and to the velocity with which the object moves. In our approach, the original source (the speakers) and listener (the microphone) are stationary, thus in absence of any motion, there is no frequency change. When a user moves his hand, however, it reflects the waves, causing a shift in frequency. This frequency is measured by the microphone ($f_r$) and can be described by the following equation, which is used for Doppler radar as well as for estimating frequency changes in reflection of light by a moving mirror [2]:

$$f_r = f_t \cdot \left( \frac{c + v}{c - v} \right)$$

where $f_r$ is the frequency perceived at the microphone, $f_t$ is the original frequency from the speaker, $c$ is the speed of sound in air, and $v$ is the velocity of the target (the hand).

Figure 2 shows the frequency of the signal (a) when no motion is present and when a hand is moved (b) away from or (c) closer to the laptop. This change in frequency as a hand moves farther or closer is one of the many characteristic properties of the received signal that we leverage in detecting motion and constructing gestures.
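The claim that the shift is proportional to both the source frequency and the hand velocity can be made explicit with a short derivation. This is a sketch that assumes the reflection form of the Doppler relation, with $f_t$ the emitted frequency, $f_r$ the received frequency, $c$ the speed of sound, and $v$ the hand velocity towards the device; the symbol names and the value $c \approx 343$ m/s are assumptions for illustration.

```latex
\begin{align*}
\Delta f &= f_r - f_t
          = f_t \left( \frac{c + v}{c - v} - 1 \right)
          = \frac{2 v}{c - v}\, f_t
          \approx \frac{2 v}{c}\, f_t \qquad (v \ll c) \\[4pt]
\text{Example: } & f_t = 18\,\text{kHz},\; v = 1\,\text{m/s},\; c = 343\,\text{m/s}
  \;\Rightarrow\; \Delta f = \frac{2 \cdot 1}{342} \cdot 18000\,\text{Hz} \approx 105\,\text{Hz}
\end{align*}
```

Under these assumptions, a slow hand movement of 1 m/s already shifts an 18 kHz tone by roughly 105 Hz, and the shift grows linearly with the tone frequency.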
Algorithm & Implementation Details
SoundWave generates a continuous pilot tone, played through the device’s speakers at the highest possible frequency (typically in the range of 18-22 kHz on commodity audio systems). Although we have verified that SoundWave can operate on audio down to 6 kHz, we favor tones above 18 kHz since they are generally inaudible [1]. Additionally, the higher the frequency, the greater the shift for a given velocity, which makes it computationally easier to estimate motion at a given resolution. The upper bound is largely a function of most laptop and phone speaker systems only being capable of producing audio at up to 22 kHz. Fortunately, we do not need much higher frequencies to sense the relatively coarse gestures we are targeting.
Due to variations in hardware as well as filtering in sound and microphone systems, SoundWave requires an initial calibration to find the optimal tone frequency (no user intervention is required). It performs a 500 ms frequency sweep, and keeps track of peak amplitude measurements as well as the number of candidate motion events detected (i.e., potential false positives). SoundWave selects the highest frequency at which minimum false events are detected and the peak is most isolated (i.e., the amplitude is at least 3 dB greater than next-highest peak in the sweep range). The system consistently favors the 18-19 kHz range.
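The selection rule of the calibration sweep can be sketched as follows. This is one possible reading of the description above, not the authors' implementation: the `SweepPoint` record, its field names, and the order in which the criteria are applied (isolation first, then fewest false events, then highest frequency) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SweepPoint:
    """One candidate tone from the calibration sweep (hypothetical record)."""
    frequency_hz: float
    peak_db: float               # amplitude of the tone's own peak
    next_highest_peak_db: float  # strongest other peak in the sweep range
    false_events: int            # candidate motion events seen while nothing moved


def select_pilot_frequency(points, isolation_db=3.0):
    """Pick the highest well-isolated frequency with the fewest false events."""
    # Keep only tones whose peak stands at least `isolation_db` above the rest.
    isolated = [p for p in points
                if p.peak_db - p.next_highest_peak_db >= isolation_db]
    if not isolated:
        return None
    fewest = min(p.false_events for p in isolated)
    # Among the quietest candidates, prefer the highest frequency.
    return max(p.frequency_hz for p in isolated if p.false_events == fewest)


sweep = [
    SweepPoint(18_000, -10.0, -20.0, 0),
    SweepPoint(19_000, -12.0, -16.0, 0),
    SweepPoint(20_000, -15.0, -16.0, 0),  # not isolated: only 1 dB margin
    SweepPoint(21_000, -18.0, -25.0, 2),  # isolated, but noisy
]
chosen = select_pilot_frequency(sweep)
```

With this made-up sweep data the sketch picks 19 kHz, which matches the observation that the system tends to favor the 18-19 kHz range.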
With the high-frequency tone being emitted, any motion in proximity (around 1 m depending on speed) of the laptop will cause Doppler-shifted reflections to be picked up by the microphone, which is continuously sampled at 44.1 kHz. We buffer the incoming time-domain signal from the microphone and compute the Fast Fourier Transform (FFT) with 2048-point Hamming window vectors. This yields 1024-point magnitude vectors that are spread equally over the spectral width of 22.05 kHz. After each FFT vector is computed, it is further processed by our pipeline: signal conditioning, bandwidth extraction, motion detection, and feature extraction.
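To make the framing concrete, here is a minimal pure-Python sketch of one analysis frame: a 2048-sample Hamming-windowed buffer at 44.1 kHz, the resulting bin width, and the bin in which an 18 kHz pilot tone lands. The helper names are illustrative, and the naive single-bin DFT stands in for a real FFT purely to keep the example dependency-free.

```python
import cmath
import math

SAMPLE_RATE_HZ = 44_100
FFT_SIZE = 2048
BIN_WIDTH_HZ = SAMPLE_RATE_HZ / FFT_SIZE  # about 21.5 Hz per bin


def hamming_window(n):
    """Standard Hamming window coefficients for an n-sample frame."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frequency_to_bin(frequency_hz):
    """Index of the magnitude bin closest to the given frequency."""
    return round(frequency_hz / BIN_WIDTH_HZ)


def bin_magnitude(frame, k):
    """Magnitude of DFT bin k of the Hamming-windowed frame (naive O(n) sum)."""
    n = len(frame)
    window = hamming_window(n)
    total = sum(w * x * cmath.exp(-2j * math.pi * k * i / n)
                for i, (w, x) in enumerate(zip(window, frame)))
    return abs(total)


# Synthetic microphone frame containing only the pilot tone (no motion).
pilot_hz = 18_000
frame = [math.sin(2 * math.pi * pilot_hz * i / SAMPLE_RATE_HZ)
         for i in range(FFT_SIZE)]
pilot_bin = frequency_to_bin(pilot_hz)
```

In practice one would compute all 1024 magnitude bins with a library FFT; the point here is only the mapping between frequencies and bin indices.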
Signal Conditioning: Informal tests with multiple people indicated that the fastest speed at which they could move their hands in front of a laptop was about 3.9 m/sec. Hence, we conservatively bound signals of interest at 6 m/sec. Given our sampling rate and FFT size, this yields about 33 frequency bins on either side of the emitted peak.
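The figure of about 33 bins can be checked with a few lines of arithmetic. The sketch below assumes the reflection form of the Doppler relation and a speed of sound of 343 m/s; both the constant and the function names are assumptions for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature
SAMPLE_RATE_HZ = 44_100
FFT_SIZE = 2048


def doppler_shift_hz(tone_hz, velocity_m_s, c=SPEED_OF_SOUND_M_S):
    """Frequency shift of a tone reflected by a target moving at velocity_m_s."""
    return tone_hz * (c + velocity_m_s) / (c - velocity_m_s) - tone_hz


def bins_for_velocity(tone_hz, velocity_m_s):
    """How many FFT bins the shifted reflection moves away from the pilot tone."""
    bin_width_hz = SAMPLE_RATE_HZ / FFT_SIZE
    return doppler_shift_hz(tone_hz, velocity_m_s) / bin_width_hz


bins_at_18k = bins_for_velocity(18_000, 6.0)  # just under 30 bins
bins_at_20k = bins_for_velocity(20_000, 6.0)  # just over 33 bins
```

Under these assumptions the 6 m/sec bound corresponds to roughly 30 bins for an 18 kHz tone and roughly 33 bins for a 20 kHz tone, in line with the number quoted above.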
Bandwidth Extraction: As seen in Figure 2, motion around the device creates a shifted frequency that effectively increases the bandwidth of the pilot tone (i.e., window averaging and spectral leakage blur the movement of the peak). To detect this, SoundWave computes the bandwidth of the pilot tone by scanning the frequency bins on both sides in-
Figure 2: (a) Pilot tone with no motion. (b and c) Increase in bandwidth on left and right due to motion away from and towards the laptop respectively. (d) Shift in frequency large enough for a separate peak. A single scan would not capture the true shift in frequency and would terminate at the local minima. A second scan compensates for the bandwidth of the shifted peak.
Session: Sensory Interaction Modalities CHI 2012, May 5–10, 2012, Austin, Texas, USA
(Gupta et al, 2012)
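Slide 41, Saturday, 27 April 2013
Doppler effect
Emitted sound: 18-22 kHz
Input is sampled, then transformed with an FFT
22.05 kHz spectrum; about 33 bins on either side of the pilot tone are of interest
Bins are scanned until the amplitude drops below 10% of the pilot tone
A second scan looks for a separate shifted peak of at least 30% of the pilot tone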
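The two-pass scan described on this slide can be sketched in Python. This is a hedged illustration, not the authors' code: the function names, the reading of the 10% and 30% thresholds as fractions of the pilot-tone amplitude, and the 33-bin search limit are assumptions drawn from the slide notes and the Figure 2 caption.

```python
def scan_until_drop(magnitudes, start, step, max_steps, floor):
    """Count consecutive bins beyond `start` (in direction `step`) at or above `floor`."""
    count = 0
    index = start + step
    while (count < max_steps and 0 <= index < len(magnitudes)
           and magnitudes[index] >= floor):
        count += 1
        index += step
    return count


def extract_bandwidth(magnitudes, pilot_bin, max_bins=33,
                      drop_ratio=0.10, second_peak_ratio=0.30):
    """Return the pilot tone's bandwidth in bins on the left and right side."""
    pilot_amplitude = magnitudes[pilot_bin]
    floor = drop_ratio * pilot_amplitude
    widths = {}
    for side, step in (("left", -1), ("right", +1)):
        # First scan: walk outwards until the amplitude drops below 10%.
        width = scan_until_drop(magnitudes, pilot_bin, step, max_bins, floor)
        # Second scan: look past the stopping point for a separate peak.
        for offset in range(width + 1, max_bins + 1):
            index = pilot_bin + step * offset
            if not 0 <= index < len(magnitudes):
                break
            if magnitudes[index] >= second_peak_ratio * pilot_amplitude:
                # Repeat the first scan starting from the shifted peak.
                extra = scan_until_drop(magnitudes, index, step,
                                        max_bins - offset, floor)
                width = offset + extra
                break
        widths[side] = width
    return widths


# Synthetic spectrum: pilot tone at bin 50 with slight leakage, no motion.
spectrum = [0.05] * 100
spectrum[49], spectrum[50], spectrum[51] = 0.5, 1.0, 0.5
still = extract_bandwidth(spectrum, pilot_bin=50)

# Fast motion towards the device: a separate peak appears at higher frequency.
spectrum[58], spectrum[59], spectrum[60] = 0.2, 0.4, 0.2
moving = extract_bandwidth(spectrum, pilot_bin=50)
```

In this toy spectrum a single scan would stop one bin to the right of the pilot tone; the second scan finds the separate peak and reports a right-hand bandwidth of 10 bins, the situation shown in Figure 2 (d).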
Sound Wave - Demo
(Gupta et al, 2012)
Slide 42, Saturday, 27 April 2013
Wake up and sleep automatically
Control media player
Sound Wave
§ Pro
  § No instrumentation of the user
  § Accurate results
  § Works even in noisy environments
§ Con
  § Pilot tone may be audible
(Gupta et al, 2012)
Slide 43, Saturday, 27 April 2013
All sensors need a network
Slide 44, Saturday, 27 April 2013
To conclude, we look at a completely different paper, which discusses how the body itself can be used as a communication network.
Gesture Pad
§ The body as touch interface
§ The body as network
§ The body as transceiver
(Rekimoto, 2001)
Slide 45, Saturday, 27 April 2013
Taken from the GestureWrist paper (the capacitance-sensing wrist sensor)
Communicate among themselves
Send data to the (touched) outside world
Humantenna inverted
Gesture Pad
Figure 4: Relation between hand shape and obtained values.
Figure 5: Example gesture commands
to a vector space (three dimensional, in this case), and a point in this space corresponds to a hand shape.
Figure 4 shows measured sensor values and their corresponding hand shapes. As shown here, the system can distinguish two hand shapes, grasping and pointing, clearly.
4.2 Forearm movement measurement
In addition to the hand-shape measurement, an acceleration sensor (Analog Devices ADXL202) is mounted on the wristwatch dial. This sensor is a solid-state 2-axis sensor and measures the inclination of the forearm.
Figure 6: Sensor configurations for GesturePad (panels A, B, and B’; labelled layers: transmitter, receiver, shield layer, fabric, body).
4.3 Tactile feedback
When a gesture is recognized, the GestureWrist gives feedback to the user by tactile sensation. On the inside of the wristwatch dial, a ceramic piezoelectric actuator is attached to produce the feedback. We use 20-Hz square-wave signals to excite this actuator.
4.4 Combining two sensor inputs
By combining these two inputs, we designed simple gesture commands. We selected two hand shapes (making a fist and pointing) and six different arm positions (palm up, palm right, palm left, palm down, forearm up, and forearm down). The hand shapes are used to separate gesture commands into segments, and two consecutive arm positions (e.g., palm left, then palm down) make up one input command. Examples of gesture commands are shown in Figure 5.
Figure 7: Variation of GesturePad Type-B which is used in combination with GestureWrist. This module receives a signal from the GestureWrist through the body. (Labelled layers: body, fabric, receiver, shield layer, transmitter.)
Continuously adjusting parameters is also possible by twisting the forearm. For example, a user can first decide which parameter to change, and control it by rotating his or her forearm.
Based on our experience, absolute values from capacitive sensors gradually change over a certain time period. This is mainly because the position of the wristband moves over time. On the other hand, the derivative of the capacitive values reflects the hand motion (e.g., from grasping to pointing) consistently. We are currently integrating this feature to add stability and robustness to gesture recognition.
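The command structure of Section 4.4 can be illustrated with a small parser. The excerpt does not say which hand shape acts as the separator, so this sketch assumes that a fist delimits commands and that the two arm positions are read while pointing; the sample format and all names are hypothetical.

```python
FIST = "fist"          # assumed to act as the delimiter between commands
POINTING = "pointing"  # assumed to be the shape held while entering a command


def parse_commands(samples):
    """Turn a stream of (hand_shape, arm_position) samples into commands.

    A command is a pair of two consecutive, distinct arm positions observed
    between two fist delimiters.
    """
    commands = []
    segment = []
    completed = False
    for hand_shape, arm_position in samples:
        if hand_shape == FIST:
            # A fist starts a new segment and discards any partial input.
            segment = []
            completed = False
            continue
        if completed:
            continue  # ignore further motion until the next delimiter
        if not segment or segment[-1] != arm_position:
            segment.append(arm_position)
        if len(segment) == 2:
            commands.append((segment[0], segment[1]))
            completed = True
    return commands


stream = [
    (FIST, "palm up"),
    (POINTING, "palm left"),
    (POINTING, "palm left"),
    (POINTING, "palm down"),
    (POINTING, "palm down"),
    (FIST, "palm down"),
    (POINTING, "forearm up"),
    (POINTING, "forearm down"),
]
parsed = parse_commands(stream)
```

With six arm positions, ordered pairs of distinct positions give up to 30 commands under this reading.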
5 GesturePad: A sensor module for interactive clothing
Our next trial is to transform conventional clothes into interactive objects. Previous work on interactive clothes [7] has used metallic yarns woven into fabrics. This approach requires specially designed clothes, and is difficult to apply to clothes that already exist. We chose a “retrofit” approach that allows users to attach interactive modules to clothes easily. In addition, we particularly concentrated on making the attachment as unnoticeable as possible. We believe that clothes are a highly social medium, and thus attaching obtrusive devices (such as [10]) is not an ideal solution.
The GesturePad is a module that consists of a layer of sensors that can be attached to the inside of clothes. A wearer can control this module from the outside. As a result, a part of the clothes becomes interactive without changing its appearance.
5.1 Sensor configurations
Figure 6 shows three configurations of the GesturePad. All types can be attached to the clothes on the inside, and the wearer controls them from the outside.
Figure 8: GesturePad prototype.
Figure 6-(A) shows Type-A, which consists of an array of capacitive sensors (a combination of transmitters and receivers) and a shield layer attached to the back. Each vertical grid line is a transmitter and each horizontal line a receiver electrode. The sensing of both the transmitter and the receiver is time-multiplexed, so the sensor can independently measure the capacitance value of each electrode crossing point.
When a user’s finger is close enough to the sensor surface (typically within 1 cm), the sensor grid recognizes the finger position. During this operation, the shield layer attached on the backside of the module blocks influence from the wearer’s body. For example, when a module is placed on the inside of a lapel, a finger stroke gesture on the lapel becomes an input to the computer. This could enable controlling the volume of a worn MP3 player. Multiple sensor points on the module also enable multiple finger inputs. For example, a chording-keyboard type input would also be possible.
Figure 6-(B) and (B’) show another sensor structure, Type-B (and B’), that consists of a transmitter and a receiver layer separated by a shield layer. In this configuration, a signal from the transmitter layer is capacitively coupled to a receiver layer through the user’s body (i.e., on-body network). When the user’s finger is within proximity of the GesturePad, a wave signal from the transmitter electrode is transmitted to the receiver one. This type could be put in a trouser pocket and operated from the outside of the pocket. One benefit of this configuration is that it can prevent other people from interacting with the sensor.
The Type-B (B’) can also use an array of sensor electrodes so the user’s finger motion is detected by comparing the received signal amplitudes. The difference between B and B’ is the placement of transmitter and receiver electrodes. Type-B places multiple transmitter electrodes on the front side and one receiver on the backside, while Type-B’ uses multiple receiver electrodes on the front side. Since multiple transmitters can be easily implemented by time-multiplexing a single transmitter, Type-B needs less hardware than Type-B’.
Our current prototype for this Type-B integrates a trans-
(Rekimoto, 2001)
Slide 46, Saturday, 27 April 2013
A: Transmitter and receiver are time-multiplexed
B: Shield layer separates transmitter from receiver
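The time-multiplexed Type-A grid can be sketched as a nested scan over transmitter and receiver lines. This is a minimal illustration under stated assumptions: `read_capacitance` is a hypothetical hardware callback, and locating the finger as the crossing point with the largest deviation from a baseline frame is a simplification chosen for the example, not a detail taken from the paper.

```python
def scan_grid(read_capacitance, n_transmitters, n_receivers):
    """Drive one transmitter line at a time and read every receiver line.

    Returns frame[rx][tx], one capacitance value per electrode crossing point.
    """
    return [[read_capacitance(tx, rx) for tx in range(n_transmitters)]
            for rx in range(n_receivers)]


def locate_finger(frame, baseline, min_delta):
    """Return (tx, rx) of the crossing that deviates most from the baseline.

    Returns None when no crossing changes by at least `min_delta`.
    """
    best = None
    best_delta = min_delta
    for rx, row in enumerate(frame):
        for tx, value in enumerate(row):
            delta = abs(value - baseline[rx][tx])
            if delta >= best_delta:
                best, best_delta = (tx, rx), delta
    return best


# Simulated sensor: 4 transmitters x 3 receivers, finger near crossing (2, 1).
def fake_sensor(tx, rx):
    return 80.0 if (tx, rx) == (2, 1) else 100.0


baseline = [[100.0] * 4 for _ in range(3)]
frame = scan_grid(fake_sensor, n_transmitters=4, n_receivers=3)
finger = locate_finger(frame, baseline, min_delta=5.0)
```

A stroke gesture such as the lapel volume control mentioned in the excerpt could then be derived by tracking how the located position changes between successive frames.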
Gesture Pad
§ Further Ideas
  § Use NFC transceivers inside pads
  § Identify the person touching by their signal
(Rekimoto, 2001)
Slide 47, Saturday, 27 April 2013
Comparison
| System | Mobility | Accuracy | Instrumentation | Main Application |
| --- | --- | --- | --- | --- |
| Muscle Computer Interface | Designed for mobile use, data sent via Wi-Fi/Bluetooth | 65% (busy hand, no feedback, 4 fingers); 91% (busy hand, feedback, 3 fingers) | An arm band at the upper forearm | Gesture recognition with busy hands |
| Gesture Wrist (capacitive sensing) | Designed for mobile use, data sent via body network | N/A | Wristwatch-like device | Hand shape recognition, authentication |
| Wrist Shape (photosensors) | Designed for mobile use, offline processing at the moment | 45-48% | Wristwatch-like device | Hand shape recognition |
| Digits (3D reconstruction) | Designed for mobile use, data sent via Wi-Fi/Bluetooth | 91%, varying from finger to finger | Small camera worn on a wrist band | Reconstructing a 3D model of the hand |
| Gesture Watch (in air over hand) | Designed for mobile use, data sent via Wi-Fi/Bluetooth | 95% | Wristwatch-like device | Simple gesture recognition using one hand |
| Sound Wave (in air over laptop) | Bound to laptop | 90-95% | None, using existing hardware | Add simple gesture recognition to laptop |
Slide 48, Saturday, 27 April 2013
Different aspects that might be required of a gesture-based interface
Summary and Future Technology
§ Today
  § Gesture recognition is feasible
  § Accuracy varies between systems
  § Integration is still complicated
§ In the future we ...
  § need to control unobtrusively
  § can authenticate with an accessory
  § wear touchable clothing
  § use the body as a network
Slide 49, Saturday, 27 April 2013
Slide 50, Saturday, 27 April 2013
Vaporware!?
Commercial from Myo
Preview of what gesture interaction could look like
“Any sufficiently advanced technology is indistinguishable from magic.”
Arthur C. Clarke
Slide 51, Saturday, 27 April 2013