1
Declarative Gesture Spotting Using
Inferred and Refined Control Points
Lode Hoste
Brecht De Rooms
Beat Signer
Department of Computer Science, Vrije Universiteit Brussel (VUB)
2
Gesture Classification
e.g. Dynamic Time Warping
sample segment
template
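To make the comparison of a sample segment against a template concrete, here is a minimal sketch of a textbook Dynamic Time Warping distance. It is not necessarily the variant used in this work; the function name `dtw_distance` and the Euclidean point cost are assumptions made for illustration.

```python
import math

def dtw_distance(sample, template):
    """Textbook DTW distance between two sequences of (x, y) points."""
    n, m = len(sample), len(template)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning sample[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(sample[i - 1], template[j - 1])  # Euclidean point cost (assumption)
            cost[i][j] = d + min(cost[i - 1][j],      # sample point repeated
                                 cost[i][j - 1],      # template point repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A segment would then be accepted when its distance to the template stays below some threshold, which has to be chosen per gesture set.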
9
Gesture Classification
Too Complex
11
Gesture Classification
We need gesture spotting
Start?
End?
Spotting reduces the amount of work for gesture classification
12
Gesture Spotting
Goal:
- High Recall
- High Precision
Popular techniques:
- Hidden Markov Models
- Continuous Dynamic Programming
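The two goals can be stated with the standard definitions of recall and precision. The helper below is only a sketch; the name `precision_recall` and the way the counts are obtained (for example by matching spotted segments against annotated gestures) are assumptions.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) as fractions in [0, 1]."""
    spotted = true_positives + false_positives      # everything the spotter reported
    performed = true_positives + false_negatives    # everything the user actually performed
    precision = true_positives / spotted if spotted else 0.0
    recall = true_positives / performed if performed else 0.0
    return precision, recall
```

Recall drops when performed gestures are missed; precision drops when segments are spotted that are not gestures.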
13
Our Declarative Gesture Spotting Approach
1. Automatically infer control points
- from one representative sample
2. Generate declarative code
- understandable and extensible spotting definitions
3. Call a gesture classification algorithm
- one shot learning
Spotted?
18
Automatically Infer Control Points
Z gesture
One representative sample
- start with the first point
- search for the smallest angle
- within a given time-frame
- refine control points (optionally)
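One possible reading of these steps is sketched below. It assumes points are `(x, y, t)` tuples sorted by time, that "smallest angle" means the sharpest corner between a point and its two neighbours, and that the final point also counts as a control point. The names `interior_angle`, `infer_control_points`, `window` and `max_angle` are hypothetical, and the optional refinement step is left out.

```python
import math

def interior_angle(a, b, c):
    """Angle in degrees at b between the rays b->a and b->c (180 = straight line)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 180.0  # degenerate case: treat as no direction change
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def infer_control_points(points, window, max_angle=150.0):
    """Return the indices of the inferred control points."""
    if not points:
        return []
    end = len(points) - 1
    controls, last = [0], 0  # start with the first point
    while last < end:
        # Interior points inside the time-frame that follows the last control point.
        candidates = [i for i in range(last + 1, end)
                      if points[i][2] - points[last][2] <= window]
        if not candidates:
            break
        angles = {i: interior_angle(points[i - 1], points[i], points[i + 1])
                  for i in candidates}
        best = min(angles, key=angles.get)  # smallest angle = sharpest corner
        if angles[best] <= max_angle:
            controls.append(best)
            last = best
        else:
            last = candidates[-1]  # no real corner in this time-frame: move on
    if controls[-1] != end:
        controls.append(end)  # assumption: the final point is a control point too
    return controls
```

For an idealised Z sampled at seven points, this picks the start, the two corners and the end.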
19
Comprehensible Code Generation
Based on the inferred control points:
(defrule symbol_z
  ?p1 (Point2D)                                     ; there is a start point
  ?p2 (Point2D)                                     ; there is a second point
  (test (< ?p1.time ?p2.time))                      ; temporal constraint: that comes after the start point
  (test (inside_control_point ?p1 ?p2 185 4 76))    ; spatial constraint: a circular area with centre point x: 185, y: 4 (pixels) and radius 76
  ?p3 (Point2D)
  (test (< ?p2.time ?p3.time))                      ; temporal constraints: the points should be ordered
  (test (inside_control_point ?p1 ?p3 15 178 76))   ; each point expressed relative to ?p1
  ?p4 (Point2D)
  (test (< ?p3.time ?p4.time))
  (test (inside_control_point ?p1 ?p4 197 175 76))
  =>
  (call DynamicTimeWarping                          ; call an external algorithm for classification
    (select-between ?p1.time ?p4.time)
    (gesture-set "CharacterZ")))
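Because the rule follows a fixed pattern per control point, generating it is mostly string templating. The sketch below emits rule text in the dialect shown on the slides; the function `generate_rule` and its parameters are hypothetical, and the control points are assumed to be given as offsets relative to the start point.

```python
def generate_rule(name, control_points, radius, gesture_set):
    """Build the rule text for a start point plus the given control points.

    control_points: (x, y) offsets relative to the start point ?p1.
    """
    lines = [f"(defrule {name}", "  ?p1 (Point2D)"]
    for k, (x, y) in enumerate(control_points, start=2):
        lines.append(f"  ?p{k} (Point2D)")
        # Temporal constraint: each point comes after the previous one.
        lines.append(f"  (test (< ?p{k - 1}.time ?p{k}.time))")
        # Spatial constraint: circular area, expressed relative to ?p1.
        lines.append(f"  (test (inside_control_point ?p1 ?p{k} {x} {y} {radius}))")
    last = len(control_points) + 1
    lines += [
        "  =>",
        "  (call DynamicTimeWarping",
        f"    (select-between ?p1.time ?p{last}.time)",
        f'    (gesture-set "{gesture_set}")))',
    ]
    return "\n".join(lines)

print(generate_rule("symbol_z", [(185, 4), (15, 178), (197, 175)], 76, "CharacterZ"))
```

Since the output is plain text, an expert can still edit the generated rule by hand afterwards.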
31
A State Machine is Not Enough
continuous stream
template
State 1 State 2
State 3 State 4
- reset state machine?
- gesture overshot
- overlapping gesture
- missed gesture
- refused by gesture classification
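A toy example of the problem, reduced to one dimension: a single state machine commits to the first point as the start of the gesture, so a gesture that actually starts later is missed, while trying every time-ordered combination of points finds it. `OFFSETS`, `TOLERANCE` and both function names are hypothetical and only serve this illustration.

```python
from itertools import combinations

OFFSETS = [10, 20]   # hypothetical 1-D control points, relative to the start point
TOLERANCE = 1

def single_state_machine(stream):
    """Greedy matcher: commits to the first point as start and never reconsiders."""
    if not stream:
        return False
    start, state = stream[0], 0
    for value in stream[1:]:
        if abs(value - start - OFFSETS[state]) <= TOLERANCE:
            state += 1
            if state == len(OFFSETS):
                return True
    return False

def every_combination(stream):
    """Tries every time-ordered combination of points as start and control points."""
    for combo in combinations(stream, len(OFFSETS) + 1):
        start = combo[0]
        if all(abs(v - start - off) <= TOLERANCE
               for v, off in zip(combo[1:], OFFSETS)):
            return True
    return False

stream = [0, 3, 13, 23]               # the gesture really starts at 3, not at 0
print(single_state_machine(stream))   # False: stuck waiting for a point near 10
print(every_combination(stream))      # True: (3, 13, 23) satisfies the constraints
```

Resetting the machine does not help in general, because it is unclear when to reset without discarding a gesture that is still in progress.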
41
Incremental Evaluation of Every Possible Combination
(defrule symbol_z
  ?p1 (Point2D)
  ?p2 (Point2D)
  (test (< ?p1.time ?p2.time))
  (test (inside_control_point ?p1 ?p2 185 4 76))
  ...
- RETE engine takes care of caching intermediate calculations
- Limit calculations to a sliding window
[Animation: partial matches of the rule, each with its own ?p1, are extended with ?p2, ?p3 and ?p4 as points arrive on the continuous stream]
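The idea of caching partial matches and bounding them with a sliding window can be sketched as follows. This is not a Rete implementation, only an illustration of the bookkeeping: `IncrementalSpotter`, `feed` and `window` are hypothetical names, points are assumed to be `(x, y, t)` tuples, and `inside_control_point` is given the semantics suggested by the slide annotations (a circle whose centre is expressed relative to `?p1`).

```python
import math

# Control points of the Z rule on the slides: offsets relative to ?p1, shared radius.
CONTROL_POINTS = [(185, 4), (15, 178), (197, 175)]
RADIUS = 76

def inside_control_point(p1, p, dx, dy, radius):
    """Assumed semantics: p lies within `radius` of p1 shifted by (dx, dy)."""
    return math.hypot(p[0] - (p1[0] + dx), p[1] - (p1[1] + dy)) <= radius

class IncrementalSpotter:
    def __init__(self, window):
        self.window = window   # maximum age of ?p1, in time units
        self.partials = []     # cached partial matches: [p1], [p1, p2], ...

    def feed(self, point):
        """Process one (x, y, t) point; return the complete matches ending at it."""
        # Sliding window: forget partial matches whose start point is too old.
        self.partials = [m for m in self.partials
                         if point[2] - m[0][2] <= self.window]
        complete, extended = [], []
        for match in self.partials:
            dx, dy = CONTROL_POINTS[len(match) - 1]
            if (point[2] > match[-1][2]
                    and inside_control_point(match[0], point, dx, dy, RADIUS)):
                grown = match + [point]
                if len(grown) == len(CONTROL_POINTS) + 1:
                    complete.append(grown)
                else:
                    extended.append(grown)
        # Old partial matches are kept so overlapping candidates stay alive,
        # and the new point also starts a match of its own.
        self.partials += extended + [[point]]
        return complete
```

Each complete match delimits a candidate segment (from its `?p1` to its `?p4`) that would then be handed to the classifier.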
74
Extending the ‘Flick Right’ Definition
Reducing false positives using negation
?p1 ?p2 ?p3
(not (and (Point2D (y ?b_y) (time ?b_time))       ; there should not be a point where the
          (test (> ?b_time ?p1.time))             ; time lies in between ?p1 and ?p3
          (test (< ?b_time ?p3.time))
          (test (> (abs (- ?p1.y ?b_y)) 245))))   ; and for which the y distance to ?p1 is greater than 245
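The negated condition reads as "no point between ?p1 and ?p3 strays more than 245 from ?p1 in y". A small sketch of the same check outside the rule engine, assuming `(x, y, t)` tuples; the function name `no_vertical_outlier` is hypothetical.

```python
def no_vertical_outlier(points, p1, p3, max_dy=245):
    """True if no point strictly between p1 and p3 in time lies more than
    max_dy away from p1 along the y axis."""
    return not any(
        p1[2] < b[2] < p3[2] and abs(p1[1] - b[1]) > max_dy
        for b in points
    )

p1, p3 = (0, 100, 0), (300, 110, 4)
flick = [p1, (150, 120, 2), p3]
print(no_vertical_outlier(flick, p1, p3))                     # True: stays roughly horizontal
print(no_vertical_outlier(flick + [(150, 400, 2)], p1, p3))   # False: a point strays vertically
```

Points outside the time interval do not affect the result, which mirrors the two temporal tests in the rule.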
80
Evaluation
• 16 gestures
• 1760 gesture samples
• 10 subjects
22  77.50  52.10  78.75  56.50
24  83.13  47.16  84.38  52.53
26  90.63  42.40  91.25  46.79
28  93.75  39.47  94.38  43.26
30  97.50  35.37  97.50  39.29
32  98.75  32.78  98.75  36.41
81
Future Work
Visual 3D gesture definitions
Scale- and rotation-invariant spotting
82
Summary
Auto infer control points, generate spotting code, call classification
• Comprehensible: easy to understand and to correct.
• Extensive spotting: support for real-time continuous streams.
• No gesture overshooting: support for overlapping submatches.
• Extensible: allows for expert adjustments.
83
Lode Hoste, Brecht De Rooms and Beat Signer, Declarative Gesture Spotting Using Inferred and Refined Control Points, Proceedings of ICPRAM 2013, International Conference on Pattern Recognition, Barcelona, Spain, February 2013