Toys, Entertainment Robots, Video Games: Challenges in Design and Control
Department of Mechanical Engineering University of Texas at San Antonio
Pranav Bhounsule
March 23, 2018
(1) Toys that walk/run
Walking robot: Honda Asimo
Wilson Walker
Wilson walker: panels (A), (B), (C)
Ravert patent: panels (A), (B), (C)
UTSA mascot, Rowdy
Rowdy Walker
Methods & Challenges
• Leg Design
• Mass Distribution
• Integrated Hinge
• Support Material
• Commercialization?
• Time
• Cost
Front leg is fixed to the body
Back leg is connected to the body through a hinge
Downhill Ramp
3D printed, linear, ON-OFF, pneumatic actuator
[Figure: panels (a), (b), (c) of the actuator. Labeled parts: Cylinder Body, Cylinder Head, Cylinder End Cap, Piston Rod, Piston Head, O-Ring, Port A, Port B]
Actuator working
Methods & Challenges
(a) Pores (Acetone)
(b) Strength (Embedding)
(c) Piston - Cylinder interface
• Viton O-rings
• Waterproof grease
Disney’s Luxo Jr. Lamp
(2) Entertainment Robots
• Manually tuned
• Time consuming
Disney animatronics
Inverse kinematics
• Bhounsule & Yamane, Humanoids 2015
Issues with Kinematics model
• Flexible joints -> Rigid body models are invalid
• Low bandwidth control -> Poor servo operation
• High degrees of freedom -> Error magnification
• Wear and tear -> Part/link replacement
[Figure: panels (a), (b) of the robot with axes X, Y, Z, numbered markers 1 to 26, lettered markers A to E, and labels Head, Left Hand, Right Hand]
Problem: Move the block to the target by applying an instantaneous force, F
• Ramp has friction, but the friction is incorrectly modeled
Iterative Learning Control (ILC): 1-D example
Model (imprecise): $x = f(F, \mu)$
Inverse (imprecise): $F = f^{-1}(x, \mu)$
[Figure: block pushed with force F along the x axis toward a target at distance D]
1-D example (trial 1)
Control (trial 1): $F_1 = f^{-1}(x, \mu)$
Result: the block travels $d_1$; error $e_1 = D - d_1$
1-D example (trial 2)
Control (trial 2): $F_2 = F_1 + \gamma e_1$, where $\gamma$ is the learning gain
Result: the block travels $d_2$; error $e_2 = D - d_2$
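1-D example (trial n)
Control (trial n): $F_n = F_{n-1} + \gamma e_{n-1}$, where $\gamma$ is the learning gain
Result: the block travels $d_n$; error $e_n = D - d_n$
Converged when $e_n$ is small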
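The 1-D example can be tried numerically. The sketch below is only an illustration under stated assumptions: the surface is flat rather than a ramp, the "instantaneous force" is treated as an impulse, and the friction model, the function names, and all numeric values are hypothetical rather than taken from the slides.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def slide_distance(impulse, mu, mass=1.0):
    """Distance a block slides on a flat surface after an impulse (assumed plant)."""
    v0 = impulse / mass
    return v0 * v0 / (2.0 * mu * G)


def inverse_model(distance, mu, mass=1.0):
    """Impulse that the model says is needed to travel `distance`."""
    return mass * math.sqrt(2.0 * mu * G * distance)


def run_ilc(target, mu_model, mu_true, gain, tol=1e-4, max_trials=50):
    """Trial 1 uses the imprecise inverse model; later trials add gain * error."""
    force = inverse_model(target, mu_model)
    history = []
    for trial in range(1, max_trials + 1):
        d = slide_distance(force, mu_true)  # what the real block does
        e = target - d                      # e_n = D - d_n
        history.append((trial, force, d, e))
        if abs(e) < tol:
            break
        force += gain * e                   # F_{n+1} = F_n + gain * e_n
    return history


if __name__ == "__main__":
    for trial, force, d, e in run_ilc(1.0, mu_model=0.2, mu_true=0.3, gain=0.5):
        print(f"trial {trial:2d}: F = {force:.4f}, d = {d:.4f}, e = {e:+.5f}")
```

The model friction is deliberately wrong (0.2 instead of 0.3), so the first trial undershoots the target; the correction term then shrinks the error from trial to trial without ever fixing the model itself.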
Our approach: Non-linear Inverse Kinematics (IK) update
where:
• $\theta_i$: angle command, trial i
• $\gamma$: learning gain
• $Y_{ref}$: end-effector reference
• $Y_i$: end-effector position, trial i
• $Y_i^{des}$: desired end-effector for IK
• $\hat{F}$: estimated forward kinematics model
Find non-linear IK within joint limits
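The symbols above can be tied together in a small simulation. This is a sketch under loudly stated assumptions, not the method from the cited paper: the update law `y_des += gain * (y_ref - y)` is assumed, the robot is a hypothetical 2-link planar arm, a closed-form IK stands in for the nonlinear constrained optimization, and joint limits are handled by simple clamping. All link lengths and names are invented for illustration.

```python
import math


def forward_kinematics(theta, l1, l2):
    """End-effector position of a 2-link planar arm."""
    t1, t2 = theta
    x = l1 * math.cos(t1) + l2 * math.cos(t1 + t2)
    y = l1 * math.sin(t1) + l2 * math.sin(t1 + t2)
    return (x, y)


def inverse_kinematics(target, l1, l2, limits=(-math.pi, math.pi)):
    """Closed-form IK (one elbow branch), clamped to joint limits."""
    x, y = target
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    c2 = max(-1.0, min(1.0, c2))  # clamp targets outside the workspace
    t2 = math.acos(c2)
    t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2), l1 + l2 * math.cos(t2))
    lo, hi = limits
    return (max(lo, min(hi, t1)), max(lo, min(hi, t2)))


def ilc_ik(y_ref, est_links, true_links, gain=0.5, trials=20):
    """Iterate: IK on the estimated model, measure on the 'real' arm, correct."""
    y_des = y_ref
    errors = []
    for _ in range(trials):
        theta = inverse_kinematics(y_des, *est_links)  # angle command, trial i
        y = forward_kinematics(theta, *true_links)     # measured position, trial i
        err = (y_ref[0] - y[0], y_ref[1] - y[1])
        errors.append(math.hypot(err[0], err[1]))
        # Assumed update: shift the IK target by gain * tracking error.
        y_des = (y_des[0] + gain * err[0], y_des[1] + gain * err[1])
    return errors


if __name__ == "__main__":
    errs = ilc_ik((0.30, 0.20), est_links=(0.30, 0.25), true_links=(0.32, 0.24))
    print(f"first-trial error {errs[0]:.2e}, last-trial error {errs[-1]:.2e}")
```

The estimated link lengths differ from the true ones, so the first command misses the reference; the loop never corrects the model, it only moves the target handed to the IK solver.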
Inverse Kinematics computation
Use nonlinear constrained optimization for IK:
• Cost: bias toward a pose
• End-effector constraint: satisfy the estimated end-effector position
• Joint constraint: satisfy joint limits
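One plausible way to write the three ingredients listed above as a single optimization problem is sketched below. The quadratic form of the cost and the symbols $\theta_{\text{pose}}$, $\theta_{\min}$, and $\theta_{\max}$ are assumptions introduced for illustration; the slides only name the cost and the two constraints.

```latex
\begin{aligned}
\theta_i = \arg\min_{\theta}\quad & \tfrac{1}{2}\,\lVert \theta - \theta_{\text{pose}} \rVert^{2}
  && \text{(bias toward a pose)} \\
\text{subject to}\quad & \hat{F}(\theta) = Y_i^{des}
  && \text{(estimated end-effector position)} \\
& \theta_{\min} \le \theta \le \theta_{\max}
  && \text{(joint limits)}
\end{aligned}
```

With more joints than end-effector coordinates, the equality constraint leaves free directions in joint space, and the cost is what picks one solution among them.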
Inverse kinematics with Iterative Learning Control
• Bhounsule & Yamane, Humanoids 2015
Results for writing task
[Figure: average square error, norm (rad)^2, versus iteration number (0 to 18)]
[Figure: end-effector path in the x (m), y (m) plane: Reference, Trial 1, Trial 18]
[Figure: end-effector path in the x (m), z (m) plane: Reference, Trial 1, Trial 18]
• Convergence: 18 trials
• Trial 1: Error ~ 1e-3
• Trial 18: Error ~ 1e-5
Other tasks
• Bhounsule & Yamane, IJHR 2017
(3) Video Games
Flappy Bird Game (iPhone/Android)
Control: Tap screen to navigate bird through pipes
Scoring: 1 point/pipe passed
Objective: Maximize points.
Flappy Bird Game
History:
• May 2013: Game released
• Jan 2014: Most downloaded game on iTunes, earning $50,000/day (?)
• Feb 2014: Game removed from iTunes by the developer, citing its addictive nature
Flappy Bird: a simple concept, but a high score is difficult to achieve
How to beat Flappy Bird (downloaded from a YouTuber)
Past work
Machine Learning
• Reinforcement learning,
• Q-learning, and
• Support Vector Machines.
• Select features,
• Learn state-action pairs
• Scores ~100-1500
Physics
Y - vertical height (up is negative)
V - vertical velocity
g - gravity (=0.1356)
z - control (flap or not flap)
constant horizontal velocity
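The variables above suggest a simple discrete-time model. The sketch below is an assumed form of the dynamics, since the slide's equations did not survive: a flap resets the vertical velocity to a fixed upward value, otherwise gravity is added each step. The value of `FLAP_VELOCITY` is hypothetical; only `GRAVITY` comes from the slide.

```python
GRAVITY = 0.1356       # from the slide; units assumed to be pixels per step^2
FLAP_VELOCITY = -2.5   # hypothetical; negative because up is negative


def step(y, v, flap):
    """Advance one time step. A flap overrides the velocity (assumed rule)."""
    v_next = FLAP_VELOCITY if flap else v + GRAVITY
    y_next = y + v_next
    return y_next, v_next


def simulate(y0, v0, flaps):
    """Roll the model forward over a sequence of flap decisions."""
    trajectory = [(y0, v0)]
    y, v = y0, v0
    for flap in flaps:
        y, v = step(y, v, flap)
        trajectory.append((y, v))
    return trajectory


if __name__ == "__main__":
    for y, v in simulate(100.0, 0.0, [False, False, True, False]):
        print(f"Y = {y:8.4f}, V = {v:+.4f}")
```

Because the horizontal velocity is constant, the time-step index doubles as the horizontal position, which is why only Y and V need to be tracked.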
#1: Heuristic controller & manual tuning
[Figure: set-point controller with regions "jump" and "don’t jump"; labels PipePos1, c, setPointY, h]
Set-point based control
Set-point is tuned.
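A set-point rule of this kind can be sketched in a few lines. The rule used here (flap whenever the bird is below the set-point) is an assumed reading of the "jump / don't jump" figure, and the flap velocity is hypothetical; only the gravity value is from the slides.

```python
GRAVITY = 0.1356       # from the slides
FLAP_VELOCITY = -2.5   # hypothetical; up is negative


def should_flap(y, set_point_y):
    """Flap when the bird is below the set-point (Y grows downward)."""
    return y > set_point_y


def run(set_point_y, y0, steps):
    """Simulate the heuristic controller and return the heights visited."""
    y, v = y0, 0.0
    heights = []
    for _ in range(steps):
        v = FLAP_VELOCITY if should_flap(y, set_point_y) else v + GRAVITY
        y += v
        heights.append(y)
    return heights


if __name__ == "__main__":
    heights = run(set_point_y=200.0, y0=200.0, steps=500)
    print(f"highest point Y = {min(heights):.1f}, lowest point Y = {max(heights):.1f}")
```

In this toy model the bird hovers in a band that sits mostly above the set-point, because each flap carries it upward for many steps. That is one reason the set-point has to be tuned relative to the pipe gap rather than placed at its center.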
Results: Heuristic controller & manual tuning
Average score 56.6/500
#2: Optimization with manually tuned constraints
[Figure: bounding box and constraint lines around the pipes. NOTE: All dimensions are in pixels]
[Figure: optimized path over the prediction horizon between PipePos1 and PipePos2, with constraints, height h, and terminal values endYlow, endYhigh, endVel. Panels: Bounding box constraints; Terminal constraints]
#2: Optimization with manually tuned constraints
[Figure: optimized path over the prediction horizon between PipePos1 and PipePos2, with constraints, height h, and terminal constraints endYlow, endYhigh, endVel]
Objective: Minimize number of jumps
Input: Jump or not (z = 0 or 1, resp.) over the horizontal distance between pipes
Constraints:
1) Physics (big-M method)
2) Bounding box constraint (pipes)
3) Terminal constraints (exit) [3 condition parameters]
Solver: mixed-integer programming with Gurobi (intlinprog)
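To make the structure of this problem concrete without a solver, the sketch below replaces the mixed-integer program with a plain exhaustive search over flap sequences, tried in order of increasing flap count. It is a stand-in for, not a reproduction of, the Gurobi formulation. The dynamics, the flap velocity, the reading of endVel as an upper bound on exit velocity, and every numeric value are assumptions.

```python
from itertools import combinations

GRAVITY = 0.1356       # from the slides
FLAP_VELOCITY = -2.5   # hypothetical; up is negative


def rollout(y, v, flaps):
    """Simulate the assumed dynamics and return the (Y, V) path."""
    path = []
    for flap in flaps:
        v = FLAP_VELOCITY if flap else v + GRAVITY
        y += v
        path.append((y, v))
    return path


def feasible(path, box, terminal):
    """Bounding-box check at every step plus terminal checks at the exit."""
    y_low, y_high = box
    end_y_low, end_y_high, end_vel = terminal
    if any(not (y_low <= y <= y_high) for y, _ in path):
        return False
    y_end, v_end = path[-1]
    return end_y_low <= y_end <= end_y_high and v_end <= end_vel


def min_flap_plan(y0, v0, horizon, box, terminal):
    """Return a feasible flap sequence with the fewest flaps, or None."""
    for n_flaps in range(horizon + 1):
        for idx in combinations(range(horizon), n_flaps):
            flaps = [k in idx for k in range(horizon)]
            if feasible(rollout(y0, v0, flaps), box, terminal):
                return flaps
    return None


if __name__ == "__main__":
    plan = min_flap_plan(100.0, 0.0, horizon=12,
                         box=(90.0, 110.0), terminal=(95.0, 108.0, 0.0))
    print("flap steps:", [k for k, f in enumerate(plan) if f])
```

Exhaustive search grows exponentially with the horizon, so it is only workable for short toy horizons like this one.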
Results: Optimization-based control, optimization horizon fixed
Perfect score: 500/500
#3: Model Predictive Control
Same as #2 but with two differences:
1) no terminal constraint
2) prediction horizon of n steps; control horizon is 1 step
[n is the only free parameter]
[Figure: optimized path and constraints, showing the control horizon and the prediction horizon]
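The receding-horizon idea can be sketched as follows: plan over n steps with only the bounding-box constraint, apply the first action, then re-plan. As a stand-in for the mixed-integer solver, the planner here is a brute-force search for the fewest-flap sequence. The dynamics, the flap velocity, the constant corridor used in place of pipes, and the short horizon are all assumptions made to keep the example small.

```python
from itertools import combinations

GRAVITY = 0.1356       # from the slides
FLAP_VELOCITY = -2.5   # hypothetical; up is negative


def step(y, v, flap):
    """One step of the assumed dynamics."""
    v = FLAP_VELOCITY if flap else v + GRAVITY
    return y + v, v


def plan(y, v, k0, horizon, corridor):
    """Fewest-flap sequence that stays inside corridor(k) for `horizon` steps."""
    for n_flaps in range(horizon + 1):
        for idx in combinations(range(horizon), n_flaps):
            yy, vv, ok = y, v, True
            for j in range(horizon):
                yy, vv = step(yy, vv, j in idx)
                low, high = corridor(k0 + j + 1)
                if not (low <= yy <= high):
                    ok = False
                    break
            if ok:
                return [j in idx for j in range(horizon)]
    return None  # no feasible sequence within the horizon


def run_mpc(y0, v0, steps, horizon, corridor):
    """Re-plan at every step and apply only the first action (control horizon 1)."""
    y, v, heights = y0, v0, []
    for k in range(steps):
        seq = plan(y, v, k, horizon, corridor)
        flap = seq[0] if seq is not None else False
        y, v = step(y, v, flap)
        heights.append(y)
    return heights


if __name__ == "__main__":
    heights = run_mpc(100.0, 0.0, steps=300, horizon=10,
                      corridor=lambda k: (60.0, 130.0))
    print(f"Y stayed within [{min(heights):.1f}, {max(heights):.1f}]")
```

The `corridor` callback is a hypothetical hook: returning tighter bounds at the time steps where a pipe sits would turn the constant corridor into a pipe course, with the horizon `n` remaining the only tuning parameter.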
Results: Model Predictive Controller
[Figure: score (data and best-fit line) versus prediction horizon, in multiples of the horizontal distance between pipes (0.625 to 1.125)]
[Figure: average optimization time (s) (data and best-fit line) versus prediction horizon, in multiples of the horizontal distance between pipes (0.625 to 1.125)]
Best-fit lines shown in the figures: exp(0.004 t^2 - 0.264 t - 3.6976) and exp(0.002 t^2 - 0.2192 t + 11.7962)
Results: Model Predictive Controller (with optimum prediction horizon of 90)
Average score 419/500
Results: Model Predictive Controller
[Figure: windows of 10, 55, 80, and 90 time steps shown against the horizontal distance between pipes]
Optimal prediction window is 90 time steps, about 1.125 times the horizontal distance between pipes
Key message: Need to plan slightly beyond the next pipe
Discussion
                                 Heuristic controller     Optimization             MPC
Score (10 runs, max 500 pipes)   56.6                     500                      419
Worst-case time (s)              ~0                       1.3                      3.9
Tuning                           Trial-and-error tuning   Trial-and-error tuning   Can be automated
Conclusion
• Position and speed on exiting the pipe seem to be key factors for good performance
• Optimization/MPC are too slow for real-time implementation
• MPC is the best compromise between score and the need for intuitive tuning