
Toys, Entertainment Robots, Video Games: Challenges in Design and Control

Department of Mechanical Engineering, University of Texas at San Antonio

Pranav Bhounsule

March 23, 2018

(1) Toys that walk/run

Walking robot: Honda Asimo

Wilson Walker

Wilson walker: panels (A), (B), (C)

Ravert patent: panels (A), (B), (C)

UTSA mascot, Rowdy

Rowdy Walker

Methods & Challenges

• Leg Design

• Mass Distribution

• Integrated Hinge

• Support Material

• Commercialization?

• Time

• Cost

Front leg is fixed to the body

Back leg is connected to the body through a hinge

Downhill Ramp

3D printed, linear, ON-OFF, pneumatic actuator

[Figure: actuator assembly, panels (a), (b), (c). Labeled parts: Cylinder Body, Cylinder Head, Cylinder End Cap, Piston Rod, Piston Head, O-Ring, Port A, Port B; callouts (1) to (6).]

Actuator working

Methods & Challenges

(a) Pores (Acetone)

(b) Strength (Embedding)

(c) Piston - Cylinder interface

• Viton O-rings

• Waterproof grease

Disney’s Luxo Jr. Lamp

(2) Entertainment Robots

• Manually tuned

• Time consuming

Disney animatronics

Inverse kinematics

• Bhounsule & Yamane, Humanoids 2015

Issues with Kinematics model

• Flexible joints -> Rigid body models are invalid

• Low bandwidth control -> Poor servo operation

• High degrees of freedom -> Error magnification

• Wear and tear -> Part/link replacement

[Figure: robot kinematic model, panels (a), (b), with X, Y, Z axes, numbered labels 1 to 26, points A to E, and end-effectors Head, Left Hand, Right Hand.]

Problem: Move block to the target by applying an instantaneous force

The ramp has friction, but the friction is incorrectly modeled

[Figure label: Target]

Instantaneous force, F

Iterative Learning Control (ILC): 1-D example

Model: $x = f(F, \mu)$ (imprecise)

Inverse: $F = f^{-1}(x, \mu)$ (imprecise)

[Figure labels: Target, x, F, D]

1-D example (trial 1)

Control (trial 1): $F_1 = f^{-1}(x, \mu)$

Error (trial 1): $e_1 = D - d_1$

[Figure labels: Target, x, D, $F_1$, $d_1$]


Control (trial 2): $F_2 = F_1 + \gamma e_1$, where $\gamma$ is the learning gain

Error (trial 2): $e_2 = D - d_2$

[Figure labels: Target, x, D, $F_2$, $d_2$]


Error (trial n): $e_n = D - d_n$

[Figure labels: Target, x, D, $F_n$, $d_n$]

Converged when $e_n$ is small
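The trial-to-trial correction in the 1-D example can be sketched in a few lines of Python. This is an illustrative sketch only: the sliding model (a block on a level surface rather than a ramp), the friction values, the unit mass, the gain of 1, and all function names are assumptions, not values from the talk.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)

def slide_distance(impulse, mu, mass=1.0):
    """Distance a block slides on a level surface after an impulse (assumed model)."""
    v0 = impulse / mass
    return v0 ** 2 / (2.0 * mu * G)

def inverse_model(distance, mu, mass=1.0):
    """Impulse predicted to reach `distance` under friction coefficient `mu`."""
    return mass * math.sqrt(2.0 * mu * G * distance)

def run_ilc(target=1.0, mu_model=0.2, mu_true=0.3, gain=1.0, tol=1e-6, max_trials=50):
    """Repeat trials, correcting the impulse by gain * error after each one."""
    force = inverse_model(target, mu_model)       # trial 1: imprecise inverse model
    errors = []
    for _ in range(max_trials):
        reached = slide_distance(force, mu_true)  # the "real" system uses mu_true
        error = target - reached                  # e_n = D - d_n
        errors.append(error)
        if abs(error) < tol:
            break
        force += gain * error                     # F_{n+1} = F_n + gain * e_n
    return errors

if __name__ == "__main__":
    for n, e in enumerate(run_ilc(), start=1):
        print(f"trial {n}: error = {e:.6f}")
```

The point of the sketch is that the inverse model is used only once, for the first guess; every later trial corrects the input from the measured error, so the wrong friction coefficient does not prevent convergence as long as the gain is small enough.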

Our approach: Non-linear Inverse Kinematics (IK) update

Symbols:

• $\theta_i$: Angle command, trial i

• $Y_{ref}$: End-effector reference

• $Y^i_{des}$: Desired end-effector for IK

• $Y_i$: End-effector position, trial i

• $\gamma$: Learning gain

• $\hat{F}$: Estimated Forward Kinematics Model

Find non-linear IK within joint limits


Inverse Kinematics computation

Use nonlinear constrained optimization for IK

Cost: Bias toward a pose

End-effector constraint: Satisfy estimated end-effector position

Joint constraint: Satisfy joint limits
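One way to write the cost and the two constraints above as a single optimization problem is sketched below. The quadratic form of the cost and the symbols $\theta_{\text{pose}}$, $\theta_{\min}$, $\theta_{\max}$, and $Y_{\text{des}}$ are assumed notation for illustration; the slide only states that the cost biases the solution toward a pose. $\hat{F}$ is the estimated forward kinematics model.

```latex
\begin{aligned}
\min_{\theta} \quad & \lVert \theta - \theta_{\text{pose}} \rVert^2 \\
\text{subject to} \quad & \hat{F}(\theta) = Y_{\text{des}}, \\
& \theta_{\min} \le \theta \le \theta_{\max}
\end{aligned}
```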

Inverse kinematics with Iterative Learning Control

• Bhounsule & Yamane, Humanoids 2015
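To make the combination concrete, here is a minimal sketch for a planar two-link arm. Everything specific is assumed: the link lengths, the gain, the closed-form IK standing in for a constrained optimizer, and in particular the update law `y_des <- y_des + gain * (y_ref - y)`, which is one common ILC form and not necessarily the one used in the cited paper.

```python
import math

def forward(theta, l1, l2):
    """Forward kinematics of a planar two-link arm: joint angles -> (x, y)."""
    t1, t2 = theta
    return (l1 * math.cos(t1) + l2 * math.cos(t1 + t2),
            l1 * math.sin(t1) + l2 * math.sin(t1 + t2))

def inverse(target, l1, l2):
    """Closed-form IK (one elbow branch) for the same arm."""
    x, y = target
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    c2 = max(-1.0, min(1.0, c2))          # clamp against round-off / unreachable targets
    t2 = math.acos(c2)
    t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2), l1 + l2 * math.cos(t2))
    return (t1, t2)

def ilc_ik(y_ref, est=(1.0, 1.0), true=(1.05, 0.95), gain=0.5, trials=30):
    """Iteratively shift the IK target so the real arm reaches y_ref."""
    y_des = y_ref
    errors = []
    for _ in range(trials):
        theta = inverse(y_des, *est)      # angle command from the estimated model
        y = forward(theta, *true)         # position the "real" arm actually reaches
        err = (y_ref[0] - y[0], y_ref[1] - y[1])
        errors.append(math.hypot(*err))   # size of the end-effector error this trial
        y_des = (y_des[0] + gain * err[0], y_des[1] + gain * err[1])
    return errors

if __name__ == "__main__":
    errs = ilc_ik((1.2, 0.5))
    print(f"trial 1 error: {errs[0]:.4f}, last trial error: {errs[-1]:.2e}")
```

The estimated model is never corrected. Only the target handed to the IK is shifted, which is what lets the scheme tolerate a kinematic model that is known to be wrong.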

Results for writing task

[Figure: average square error, norm (rad)^2, versus iteration number (0 to 18); end-effector paths in the x-y plane and the x-z plane (axes in m), each showing Reference, Trial 1, and Trial 18.]

• Convergence: 18 trials

• Trial 1: Error ~ 1e-3

• Trial 18: Error ~ 1e-5

Other tasks

• Bhounsule & Yamane, IJHR 2017

(3) Video Games

Flappy Bird Game (iPhone/Android)

Control: Tap screen to navigate bird through pipes

Scoring: 1 point/pipe passed

Objective: Maximize points.

Flappy Bird Game History:

• May 2013: Game released

• Jan 2014: Most downloaded game on iTunes, earning $50,000/day (?)

• Feb 2014: Game removed from iTunes by developer citing its addictive nature

Flappy Bird: a simple concept, but a high score is difficult to achieve

How to beat Flappy Bird (downloaded from a YouTuber)

Past work

Machine Learning

• Reinforcement learning,

• Q-learning, and

• Support Vector Machines.

• Select features,

• Learn state-action pairs

• Scores ~100-1500

Physics

Y - vertical height (up is negative)

V - vertical velocity

g - gravity (=0.1356)

z - control (flap or not flap)

constant horizontal velocity
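The symbols above suggest a simple discrete-time model, sketched here. Only g = 0.1356 and the sign convention (up is negative) come from the slide; the update order, the flap velocity `V_FLAP`, and the function name `step` are assumptions made for illustration.

```python
G = 0.1356      # gravity per time step, from the slide
V_FLAP = -2.5   # velocity right after a flap (assumed; negative means upward)

def step(y, v, flap):
    """Advance the bird one time step and return the new (y, v)."""
    v = V_FLAP if flap else v + G   # a flap resets velocity, otherwise gravity adds to it
    y = y + v                       # y grows downward, so positive v means falling
    return y, v

if __name__ == "__main__":
    y, v = 0.0, 0.0
    for k in range(5):
        y, v = step(y, v, flap=(k == 2))
        print(k, round(y, 4), round(v, 4))
```

Because the horizontal velocity is constant, the horizontal position is just the time step index scaled by that velocity, so only the vertical state needs to be simulated.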

#1: Heuristic controller & manual tuning

[Figure: set-point controller. Labels: jump, don't jump, PipePos1, setPointY, c, h.]

Set-point based control

Set-point is tuned.
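A set-point rule of this kind can be sketched as below. The comparison against a single tuned height, the offset value, and all names (`set_point`, `should_flap`, `gap_center_y`, `offset`) are assumptions; the slide only says that control is set-point based and that the set-point is tuned.

```python
def set_point(gap_center_y, offset=10.0):
    """Tuned target height: a hypothetical offset below the center of the pipe gap."""
    return gap_center_y + offset    # y grows downward, so + moves the set-point lower

def should_flap(y, set_point_y):
    """Flap only when the bird has dropped below the set-point."""
    return y > set_point_y          # larger y means lower on screen

if __name__ == "__main__":
    sp = set_point(gap_center_y=100.0)
    print(should_flap(y=115.0, set_point_y=sp))  # bird is below the set-point
    print(should_flap(y=90.0, set_point_y=sp))   # bird is above the set-point
```

The only free parameter in this sketch is `offset`, which is the kind of quantity that would have to be tuned by trial and error.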

Results: Heuristic controller & manual tuning

Average score 56.6/500


#2: Optimization with manually tuned constraints

[Figure: bounding box constraints. Labels: constraint lines, bounding box; dimensions 48, 10, 7, 16, 10, 28, 80. NOTE: All dimensions are in pixels.]

[Figure: terminal constraints. Labels: constraints, prediction horizon, optimized path, PipePos1, PipePos2, h, endYlow, endYhigh, endVel.]

Minimize number of jumps

Input: Jump or not (z = 0 or 1, respectively) over the horizontal distance between pipes.

Constraints:

1) Physics (big M method)

2) Bounding box constraint (pipes)

3) Terminal constraints (exit) [3 parameters: endYlow, endYhigh, endVel]

Mixed Integer Programming software Gurobi (intlinprog)
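As an illustration of how the big-M method can encode the jump physics with a binary variable, one possible formulation is sketched below. Here $b_k = 1$ denotes a jump at step $k$, $V_{\text{jump}}$ is the velocity right after a jump, $N$ is the number of steps in the horizon, and $M$ is a constant larger than any velocity difference that can occur. All of this is assumed notation, as is the position update, and it may differ from the encoding actually used.

```latex
\begin{aligned}
\min_{b_0,\dots,b_{N-1}} \quad & \sum_{k=0}^{N-1} b_k, \qquad b_k \in \{0, 1\} \\
\text{s.t.} \quad
& -M\,b_k \le V_{k+1} - (V_k + g) \le M\,b_k, \\
& -M\,(1 - b_k) \le V_{k+1} - V_{\text{jump}} \le M\,(1 - b_k), \\
& Y_{k+1} = Y_k + V_{k+1}
\end{aligned}
```

When $b_k = 0$ the first pair of inequalities forces $V_{k+1} = V_k + g$ and the second pair is slack; when $b_k = 1$ the roles swap and $V_{k+1} = V_{\text{jump}}$. The bounding box and terminal constraints are then ordinary linear inequalities on $Y_k$ and $V_N$.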

Results: Optimization-based control, optimization horizon fixed

Perfect score 500/500


#3: Model Predictive Control

Same as #2 but with TWO differences:

1) no terminal constraint

2) prediction horizon (n), control horizon is 1 step.

[Figure: MPC. Labels: optimized path, constraints, control horizon, prediction horizon.]

[n is the only free parameter]
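The receding-horizon idea can be sketched without a solver. In this sketch an exhaustive search over flap sequences stands in for the mixed-integer program, which is only practical for very short horizons. The flap velocity, the fixed height band (no pipes), the horizon of 8 steps, and every function name are assumptions for illustration.

```python
from itertools import product

G = 0.1356                  # gravity per time step, from the slides
V_FLAP = -2.5               # velocity right after a flap (assumed)
Y_MIN, Y_MAX = 0.0, 100.0   # allowed height band in pixels (assumed, y grows downward)

def step(y, v, flap):
    """One time step of the assumed bird dynamics."""
    v = V_FLAP if flap else v + G
    return y + v, v

def plan(y, v, horizon):
    """Return the feasible flap sequence with the fewest flaps, or None."""
    best = None
    for seq in product((False, True), repeat=horizon):
        yy, vv, ok = y, v, True
        for flap in seq:
            yy, vv = step(yy, vv, flap)
            if not (Y_MIN <= yy <= Y_MAX):   # constraint violated inside the horizon
                ok = False
                break
        if ok and (best is None or sum(seq) < sum(best)):
            best = seq
    return best

def run_mpc(steps=300, horizon=8, y=50.0, v=0.0):
    """Re-plan every step and apply only the first action (control horizon of 1)."""
    heights, flaps = [], 0
    for _ in range(steps):
        seq = plan(y, v, horizon)
        flap = seq[0] if seq is not None else True  # fall back to flapping if infeasible
        flaps += flap
        y, v = step(y, v, flap)
        heights.append(y)
    return heights, flaps

if __name__ == "__main__":
    heights, flaps = run_mpc()
    print(f"min height {min(heights):.1f}, max height {max(heights):.1f}, flaps {flaps}")
```

There is no terminal constraint here: the plan only has to stay inside the band over the horizon, and the horizon length is the single parameter to choose.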

Results: Model Predictive Controller

[Figure: two plots against prediction horizon (x horizontal distance between pipes, 0.625 to 1.125): score (axis 0 to 4500) and average optimization time (axis 0.015 to 0.065 s), each showing data and a best-fit line. Best-fit expressions shown: exp(0.004 t^2 - 0.264 t - 3.6976) and exp(0.002 t^2 - 0.2192 t + 11.7962).]

Results: Model Predictive Controller (with optimum prediction horizon of 90)

Average score 419/500


[Figure: timing diagram. Labels: 80 time steps, 10 time steps, 90 time steps, 55 time steps, horizontal distance between pipes.]

Optimal prediction window is 90 time steps, about 1.125 x the horizontal distance between pipes

Key message: Need to plan slightly beyond the next pipe


Discussion

| | Heuristic controller | Optimization | MPC |
| Score (10 runs, max 500 pipes) | 56.6 | 500 | 419 |
| Worst case time (sec) | ~0 | 1.3 | 3.9 |
| Tuning | Trial and error tuning | Trial and error tuning | Can be automated |

Conclusion

• Position and speed on exiting the pipe seem to be key factors for good performance

• Optimization/MPC are too slow for real-time implementation

• MPC is the best compromise between score and the need for intuitive tuning

Date post:	05-Oct-2020